Paper Notes
🗒️Geometric Deep Learning
2024-5-22
2024-6-10

Geometric deep learning: going beyond Euclidean data

Specific details and my impressions of the paper are recorded in the notes below:

Insights

Overall, this paper describes migrating CNNs from the Euclidean domain to non-Euclidean domains in order to handle both the structure of the domain and the data on the domain; mathematical derivation makes up a large share of the paper.
The Euclidean domain enjoys stationarity (statistics are translation-invariant), stability (robustness to small local deformations), and compositionality (a multiscale structure can be exploited), which is why CNNs built from filters (local feature extraction with shared parameters) and pooling (multiscale aggregation) work so well; non-Euclidean domains retain few of these properties, so several problems must be solved during the migration. For manifolds and graphs, operators are defined on a Hilbert space so that the subsequent formulas have a well-defined scope. Just as CNNs use the Fourier basis, the non-Euclidean setting uses the Laplacian eigenvectors as a basis, but the two bases differ fundamentally in their definition: the former consists of sin/cos functions and admits fast algorithms such as the FFT, while the latter does not, which motivates a series of methods such as GNN and ChebNet that reduce the computational complexity.
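A quick numerical check of the basis analogy (my own numpy sketch, not from the paper): on a ring graph, i.e. a discretized 1D Euclidean domain, the graph Laplacian eigenvectors span exactly the classical sin/cos Fourier modes.

```python
# Sketch: Laplacian eigenvectors of a cycle graph = discrete Fourier basis.
import numpy as np

n = 16
W = np.zeros((n, n))
for i in range(n):  # ring graph: each node connected to its two neighbours
    W[i, (i + 1) % n] = W[(i + 1) % n, i] = 1.0

L = np.diag(W.sum(axis=1)) - W            # combinatorial Laplacian L = D - W
eigvals, eigvecs = np.linalg.eigh(L)      # orthonormal "graph Fourier" basis

# Frequency-1 sin/cos modes, normalized to unit norm:
t = 2 * np.pi * np.arange(n) / n
fourier = np.stack([np.cos(t), np.sin(t)], axis=1) / np.sqrt(n / 2)

# eigvecs[:, 1:3] spans the eigenspace of eigenvalue 2 - 2cos(2*pi/n);
# projecting it onto span{cos, sin} reproduces it exactly:
coeff, *_ = np.linalg.lstsq(fourier, eigvecs[:, 1:3], rcond=None)
print(np.allclose(fourier @ coeff, eigvecs[:, 1:3]))  # True
```

On a general graph there is no such closed form and no FFT, which is exactly the cost issue the spectral methods below have to address.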
I also wondered why we need to work in the unintuitive spectral domain at all. Mainly because the spectral domain captures intrinsic properties (the high-dimensional structure inherent to the domain), and spectral methods offer a multiscale analytical perspective and do not depend on any spatial ordering (which improves generalization). Spatial-domain methods, in contrast, do better at capturing local features, at making the operations intuitively interpretable, and at handling dynamic changes in the data.
Also, in the spectral domain the Laplacian eigenvectors are domain-dependent, so per-instance changes (where the underlying manifold itself changes) cannot be captured well (e.g., points on a human body lie on different manifolds under different poses). Hence mechanisms such as attention are used to fuse the shape before and after the motion, letting the model learn the point-to-point relations so that the representation has local correspondence. Conversely, in the spatial domain the representation of the same object under deformation is comparatively easy to capture, but the representation itself is harder to define well; currently patch operators are used to realize local representations.
Current research still spans several directions: graph generation and point-cloud generation (operating on the corresponding representations), handling directed graphs (the weight matrix must be corrected), tracking dynamic graphs, and so on. The theory above provides a fairly rigorous and strong explanatory analysis for GNNs and related models, which I find interesting. Beyond vision tasks such as 3D point clouds, applying these techniques to fields such as brain science (AI4Science) is also appealing (bringing graphs into real-world tasks).

Key Notes

  • basics of Euclidean convolutional neural networks
    • CNN
      • key properties: convolutional (translational invariance/stationarity, realized by convolution), scale separation (compositionality: scales can be separated via a multiscale structure, realized by downsampling), filters localized in space (deformation stability: a slightly deformed input yields nearly the same output, an inductive bias), O(1) parameters per filter (independent of input image size n), O(n) complexity per layer (filtering done in the spatial domain), O(log n) layers in classification tasks (not exposed to the curse of dimensionality)
      • A CNN consists mainly of filtering and pooling, so what follows migrates exactly these two components, considered from both the spectral domain and the spatial domain
  • basics of graph theory
    • prototypical non-Euclidean objects:manifolds and graphs
    • challenges of geometric deep learning:extend NN techniques to graph- or manifold-structured data
      • assumption: non-Euclidean data are locally stationary and manifest hierarchical structures (geometric priors exist beyond Euclidean domains)
      • how to define composition?(convolution and pooling on graphs/manifolds)
      • how to make them fast?
    • graph
      • Hilbert space with inner product (a Hilbert space is a complete inner-product space, which makes it possible to discuss limits of functions; the inner product also helps define magnitude and similarity, and makes the tools of functional analysis (the study of functions in infinite-dimensional spaces) available on manifolds)
        • On the Hilbert space one has the Laplacian operator (a self-adjoint operator), whose eigenvalues and eigenfunctions can be analyzed and exploited via spectral theory
          • Up to scale, $(\Delta f)(x)$ measures the difference between f and its local average; on a graph it is the positive semi-definite matrix $\Delta = D - W$ (degree matrix minus weight matrix), and the Dirichlet energy $E(f) = f^\top \Delta f = \tfrac{1}{2}\sum_{ij} w_{ij}(f_i - f_j)^2$ measures the smoothness of f (how fast it changes locally); see the sketch below
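A minimal numpy sketch of those two identities (toy weight matrix and signal of my own choosing):

```python
# Sketch: L = D - W and the two equivalent forms of the Dirichlet energy.
import numpy as np

W = np.array([[0, 1, 0, 2],
              [1, 0, 3, 0],
              [0, 3, 0, 1],
              [2, 0, 1, 0]], dtype=float)   # symmetric edge weights
D = np.diag(W.sum(axis=1))                  # degree matrix
L = D - W                                   # graph Laplacian (PSD)

f = np.array([0.0, 1.0, 1.5, 0.2])          # a signal on the 4 vertices
quadratic = f @ L @ f                       # Dirichlet energy f^T L f
pairwise = 0.5 * np.sum(W * (f[:, None] - f[None, :]) ** 2)
print(np.isclose(quadratic, pairwise))      # True: identical values
print(np.allclose(L @ np.ones(4), 0.0))     # constant f has zero energy
```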
    • manifold
      • topological space; the tangent plane $T_x\mathcal{X}$ is the plane tangent to the manifold around the point $x$
      • Riemannian metric: local intrinsic structure at $x$
        • scalar fields and vector fields, with the inner product of two functions computed in the Hilbert space (via the Riemannian metric)
        • Laplacian: defined jointly by the gradient and the divergence, $\Delta f = -\operatorname{div}(\nabla f)$
          • continuous limit of the graph Laplacian (when the manifold is discretized as a mesh, the graph Laplacian converges to it under suitable conditions)
          • Dirichlet energy: posed as an optimization problem (find the smoothest functions), whose solutions are the orthogonal eigenvectors of the Laplacian together with their eigenvalues (see the formulation below)
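A compact statement of that optimization view (standard formulation, notation mine):

```latex
% Successively minimize the Dirichlet energy over unit-norm functions
% orthogonal to all previous minimizers:
f_k \;=\; \arg\min_{\substack{\|f\|=1 \\ f \,\perp\, \{f_1,\dots,f_{k-1}\}}} f^\top \Delta f
% The minimizers are the orthogonal Laplacian eigenvectors, f_k = \phi_k,
% and the attained energies are the eigenvalues: \phi_k^\top \Delta \phi_k = \lambda_k.
```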
    • Fourier analysis on graphs
      • Fourier basis = Laplacian eigenfunctions
      • Euclidean space convolution
        • shift-invariance: the convolution operator is diagonalized by the Fourier basis (via EVD), so convolution is computed as the pointwise product $\hat{f}\cdot\hat{g}$ in the Fourier domain
        • efficiently calculated by FFT
      • spectral convolution
        • not shift-invariant (the Laplacian of a general graph is not a circulant matrix, unlike the sin/cos setting of the classical Fourier transform)
        • filter coefficients basis dependent (not generalize across graphs)
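A minimal numpy sketch of spectral graph convolution as defined above; the random graph and the heat-kernel filter $\hat{g}(\lambda)=e^{-\lambda}$ are my own toy choices.

```python
# Sketch: f * g = Phi diag(g_hat) Phi^T f on a random undirected graph.
import numpy as np

rng = np.random.default_rng(0)
n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
W = np.triu(A, 1)
W = W + W.T                                  # random symmetric adjacency
L = np.diag(W.sum(axis=1)) - W

eigvals, Phi = np.linalg.eigh(L)             # graph Fourier basis

f = rng.standard_normal(n)                   # signal on the vertices
g_hat = np.exp(-eigvals)                     # low-pass filter in the spectrum

f_hat = Phi.T @ f                            # graph Fourier transform
out = Phi @ (g_hat * f_hat)                  # filter spectrally, transform back
print(np.allclose(out, Phi @ np.diag(g_hat) @ Phi.T @ f))  # True
```

Note the dense O(n²) transforms and the basis dependence: a different graph has a different Φ, so learned coefficients do not transfer.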
    • geometric learning problems
      • characterize the structure(structure of the domain)
        • Spectral methods:Spectral CNN (SCNN),Spectral CNN with smooth spectral multipliers
        • Spectral-free methods:Graph Neural Network (GNN),Graph Convolutional Network (GCN),ChebNet
      • analyzing functions (on a fixed domain / across multiple domains)
        • Charting-based methods:Anisotropic CNN,Mixture model network (MoNet)
        • Combined spatial/spectral methods: Wavelets, Localized Spectral CNN (LSCNN)
    • spectral-domain methods
      • spectral graph CNN
        • filter coefficients basis dependent (not generalize across graphs)
        • O(n) parameters per layer, O(n²) computational cost for the forward and inverse Fourier transforms (there is no FFT on graphs), and no guarantee of spatial localization of the filters (W does not have this property)
      • localization in space = smoothness in the frequency domain (follows from the vanishing-moments argument)
        • parametrize the filter with a smooth spectral transfer function $\tau(\lambda)$ and apply it to the Laplacian eigenvalues
          • ChebNet: polynomial of order r
            • O(1) parameters per layer (just the polynomial coefficients), filters have guaranteed r-hop support, and O(n) computational complexity on sparse graphs (the eigenvectors never need to be computed explicitly); see the sketch below
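A minimal sketch of that polynomial filtering via the Chebyshev recurrence (the coefficients θ are arbitrary placeholders; in practice λ_max is estimated cheaply instead of via a full eigendecomposition):

```python
# Sketch: g(L) f = sum_k theta_k T_k(L_rescaled) f, eigenvector-free.
import numpy as np

def chebnet_filter(L, f, theta):
    """Apply an order-(len(theta)-1) Chebyshev filter to signal f."""
    lam_max = np.linalg.eigvalsh(L).max()        # cheap estimate in practice
    L_res = 2.0 * L / lam_max - np.eye(len(f))   # rescale spectrum to [-1, 1]
    T_prev, T_curr = f, L_res @ f                # T_0 f and T_1 f
    out = theta[0] * T_prev + theta[1] * T_curr
    for k in range(2, len(theta)):
        T_next = 2.0 * L_res @ T_curr - T_prev   # T_k = 2 L~ T_{k-1} - T_{k-2}
        out = out + theta[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out                                   # r-hop support by construction

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
print(chebnet_filter(L, np.array([1.0, 0.0, 0.0, 0.0]), theta=[0.5, 0.3, 0.2]))
```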
      • pooling:produce a sequence of coarsened graphs
      • limitations
        • poor generalization across non-isometric domains (unless kernels are localized, possibly with attention)
        • spectral kernels are isotropic (the Laplacian is rotation-invariant, so anisotropy cannot be expressed)
        • only undirected graphs (the Laplacian matrix must be symmetric)
      • spectral-free
        • spatial constructions that can handle structure dynamically
        • input set:invariant to permutations,dynamic resizing(mean can be a very strong baseline)
        • Graph Neural Network
          • simply aggregates each vertex's own representation h with those of its neighbours; applied iteratively, this aggregates information from successively deeper layers and eventually from the whole graph
          • decorate: edges, vertices (attach features to them)
          • others: skip connections, multiplicative interactions/gating, pairwise interactions with learnable functions (aka neural message passing)
          • compared with a CNN, a GNN cannot capture direction (because it is isotropic) and is less expressive (a minimal layer sketch follows this list)
            • remedies: edge filters, multiple non-isotropic weight matrices, vertex decoration (e.g., coordinates)
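A minimal sketch of one such isotropic aggregation layer (pure numpy; the mean aggregation and ReLU are my own choices among the many variants above):

```python
# Sketch: one message-passing layer -- aggregate neighbours, combine with
# self, apply a shared linear map and a non-linearity.
import numpy as np

def mp_layer(adj, H, Theta):
    """adj: n x n adjacency, H: n x d vertex features, Theta: d x d' weights."""
    deg = adj.sum(axis=1, keepdims=True)
    agg = (adj @ H) / np.maximum(deg, 1.0)       # mean over neighbours
    return np.maximum(0.0, (H + agg) @ Theta)    # combine with self + ReLU

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]], dtype=float)
H = rng.standard_normal((3, 4))
Theta = rng.standard_normal((4, 2))
print(mp_layer(adj, H, Theta).shape)             # (3, 2): new vertex features
```

Stacking such layers lets each vertex see an ever larger neighbourhood; the aggregation is permutation-invariant but direction-blind, which is exactly the isotropy limitation noted above.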
    • spatial-domain methods
      • charting-based
      • the constructions above give convolution in the spectral domain; a link to the spatial domain must be established
        • patch operators, geodesic polar coordinates (polar coordinates for the points inside a patch); spatial convolution can then be written as $(f \star g)(x) = \sum_k g_k \,(D_k(x)f)$, where $D_k(x)f$ is the patch operator
          • making the patch-operator parameters learnable gives a learnable patch operator; if a Gaussian mixture is used for the weighting functions one obtains the Mixture Model Network (MoNet), though it lacks interpretability in low dimensions; a CNN can be seen as a special case of MoNet (see the sketch below)
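A minimal sketch of such a patch operator with Gaussian-mixture weights, MoNet-style: pseudo-coordinates u(x, y) on the edges, K Gaussian kernels, and patch responses $D_k(x)f = \sum_y w_k(u(x,y))\, f(y)$. All shapes and parameter values are illustrative.

```python
# Sketch: MoNet-style learnable patch operator for one vertex.
import numpy as np

def monet_patch(u, f, mu, sigma):
    """u: E x 2 pseudo-coordinates of the E neighbours, f: E neighbour values,
    mu: K x 2 Gaussian centres, sigma: K x 2 diagonal variances."""
    diff = u[None, :, :] - mu[:, None, :]                              # K x E x 2
    w = np.exp(-0.5 * np.sum(diff ** 2 / sigma[:, None, :], axis=-1))  # K x E
    return w @ f                                                       # K responses

# One vertex with 3 neighbours in geodesic polar coordinates (rho, theta):
u = np.array([[0.10, 0.0], [0.20, 2.1], [0.15, 4.0]])
f = np.array([1.0, -0.5, 2.0])
mu = np.array([[0.1, 0.0], [0.2, 3.14]])       # K = 2 learnable kernel centres
sigma = np.full((2, 2), 0.5)                   # learnable variances, fixed here
print(monet_patch(u, f, mu, sigma))            # conv then mixes: sum_k g_k * (D_k f)
```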
        • application
          • deformable 3D correspondence
            • isometric (the same person in a different pose), non-isometric, partial
            • local feature learning: a Siamese net that pulls corresponding point pairs on the two manifolds as close together as possible; however, (1) during training the loss cannot measure how far the prediction is from the ground truth in spatial terms, and (2) at test time there is no guarantee that spatially adjacent points get adjacent predictions
              • introduce a soft correspondence error (a probability-weighted geodesic distance; toy sketch below)
              • intrinsic structured prediction (FMNet): a soft correspondence layer (combining the eigenvectors of both shapes, the functional mapping recovers the likely corresponding regions) plus a functional map layer that fuses information from the two shapes, with matching terms added to the cost
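A toy sketch of that probability-weighted geodesic error (softmax over descriptor similarities; all tensors are placeholders):

```python
# Sketch: soft correspondence error -- spatially close mismatches cost little.
import numpy as np

def soft_corr_loss(sim, D_geo, gt):
    """sim: n_src x n_tgt descriptor similarities, D_geo: n_tgt x n_tgt
    geodesic distances on the target shape, gt: true target index per source."""
    p = np.exp(sim - sim.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)              # soft correspondence map
    return np.mean(np.sum(p * D_geo[gt], axis=1))  # expected geodesic error

rng = np.random.default_rng(0)
sim = rng.standard_normal((4, 5))                  # toy similarities
D_geo = np.abs(np.arange(5)[:, None] - np.arange(5)[None, :]).astype(float)
print(soft_corr_loss(sim, D_geo, gt=np.array([0, 1, 2, 3])))
```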
          • matrix completion and recommender systems
            • e.g., rows are users and columns are items; the goal is to minimize the rank of the matrix (usually via a convex relaxation such as the nuclear norm)
            • multi-graph Fourier transform: relations among users and among items can be brought into the loss so that similar users are recommended similar items (i.e., smoothness on both graphs is enforced)
              • use factor variables instead of the full matrix, decomposing X into two smaller matrices, $X \approx WH^\top$
            • matrix completion with a recurrent multi-graph CNN (RMGCNN): iterative completion (see the sketch below)
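A minimal sketch of the graph-regularized factorization objective behind these methods: fit the observed entries with $X \approx WH^\top$ while keeping the factors smooth on the row and column graphs. Plain gradient descent with stand-in Laplacians, not the paper's recurrent architecture.

```python
# Sketch: min ||mask*(W H^T - X)||^2 + lam*(tr(W^T L_r W) + tr(H^T L_c H)).
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2
X = rng.random((m, n))
mask = rng.random((m, n)) < 0.5                 # observed entries only
L_r, L_c = np.eye(m), np.eye(n)                 # stand-in row/column Laplacians

W, H = rng.standard_normal((m, r)), rng.standard_normal((n, r))
lam, lr = 0.1, 0.05
for _ in range(200):
    R = mask * (W @ H.T - X)                    # residual on observed entries
    gW = R @ H + lam * (L_r @ W)                # data term + smoothness term
    gH = R.T @ W + lam * (L_c @ H)
    W, H = W - lr * gW, H - lr * gH
print(np.linalg.norm(mask * (W @ H.T - X)))     # reconstruction error shrinks
```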
          • particle physics and chemistry
            • predict molecular properties such as atomization energy; DFT (density functional theory) can solve this but is computationally very expensive, so a GCN is considered as a data-driven replacement
            • variants: degree-conditioned parameters with no hidden edge features, or learning edge features at each layer
          • TSP-style problems (QAP, the quadratic assignment problem)
            • NP-hard; the goal is to find the optimal assignment/ordering
            • data-driven GNN: initialize a random graph, relax the quadratic assignment to a linear assignment, and use a GNN Siamese embedding architecture; an open question is whether an end-to-end shape-correspondence variant could be built this way
        • open problems and future directions
          • from handcrafted features to learnable methods
          • generalization:pooling & filter,cross domain
            • spectral methods: the spectral representation is domain-dependent
              • applying spatial transformer networks in the spectral domain, constructing compatible orthogonal bases via joint Laplacian diagonalization
            • spatial methods: constructing low-dimensional local spatial coordinates on graphs turns out to be challenging (particularly anisotropic diffusion on general graphs)
            • spectrum-free: GNN with non-linearity or learned parameters $\boldsymbol{\theta}$, simulating a high power of diffusion
          • time-varying domains: track how changes affect signals (abnormal activity detection, 3D dynamic shapes)
          • directed graphs: non-symmetric Laplacian matrices (which do not admit an orthogonal eigendecomposition)
          • scaling up models to millions of nodes (with the graph matrix fixed)
            • synthesis problems
              • graph synthesis (GraphVAE)
              • shape synthesis (point-cloud completion with a geometric autoencoder, shape spatio-temporal prediction with GDL; the key is to reconstruct the geometric structure from some intrinsic representation)
          • computation:GPU hardware design to satisfy the graph structure
        • references
            1. M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, P. Vandergheynst, “Geometric deep learning: going beyond Euclidean data”, arXiv:1611.08097, 2016. First review paper of geometric deep learning.
            2. https://www.youtube.com/watch?v=LvmjbXZyoP0 Tutorial on graph and manifold research by Michael Bronstein
            3. https://www.youtube.com/watch?v=LeeUzusWz5g Geometric Deep Learning: Past, Present, And Future, by Michael Bronstein
            4. M. Gori, G. Monfardini, F. Scarselli, “A new model for learning in graph domains”, Proc. IJCNN 2005. First neural net on graphs (GNN framework).
            5. F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, G. Monfardini, “The graph neural network model”, IEEE Trans. Neural Networks 20(1):61-80, 2009.
            6. J. Bruna, W. Zaremba, A. Szlam, Y. LeCun, “Spectral networks and locally connected networks on graphs”, Proc. ICLR 2014. First spectral CNN on graphs.
            7. M. Henaff, J. Bruna, Y. LeCun, “Deep convolutional networks on graph-structured data”, arXiv:1506.05163, 2015. Spectral CNN with smooth multipliers.
            8. J. Masci, D. Boscaini, M. M. Bronstein, P. Vandergheynst, “Geodesic convolutional neural networks on Riemannian manifolds”, arXiv:1501.06297v2, 2015. First spatial CNN on manifolds (Geodesic CNN framework).

        Videos

        Start from 1:03:58: Robustness and guaranteed performance
        Geometric Deep Learning - Michael Bronstein - MLSS 2020, Tübingen
        Covers: motivation, the curse of dimensionality, and CNN inductive biases; graphs, Laplacians, and Dirichlet energy; spectral vs. spatial graph convolution (ChebNet, GCN, MoNet, GAT, MPNN); pooling and graph coarsening; depth and scalability (SGCN, SIGN); expressive power and the Weisfeiler-Lehman test; dynamic graphs (Temporal Graph Networks); applications in recommender systems, high-energy physics, chemistry and drug design, protein science, and 3D vision.

        Geometric Deep Learning: Past, Present, And Future, by Michael Bronstein
        Seminar by Michael Bronstein at the UCL Centre for AI, recorded 3 February 2021. Despite a series of success stories, geometric deep learning has not yet come close to the success of convolutional networks in computer vision; the talk outlines possible reasons and how the field could progress over the next few years.
        Geometric Deep Learning
        Introductory video by Siraj Raval: classifying people by their interest in sports teams with a graph convolutional network, plus a 3D object classification demo. Code: https://github.com/llSourcell/pytorch_geometric
        Geometric Deep Learning (AMMI 2022 by Michael Bronstein)
         
