Akira’s Machine Learning News — Issue #38

Akihiro FUJII
Dec 23, 2021


Featured Paper/News in This Week.

  • A 3D-Transformer that can be applied directly to molecular structures has been proposed. Its attention weights are adjusted according to interatomic distances, and the computational cost appears to be modest.

— — — — — — — — — — — — — — — — — — –

In the following sections, I introduce articles and papers that cover the above and more, organized into the following four topics.

  1. Featured Paper/News in This Week
  2. Machine Learning Use Case
  3. Papers
  4. Articles related to machine learning technology

— — — — — — — — — — — — — — — — — — –

1. Featured Paper/News in This Week

Transformer model applicable to three-dimensional molecular structures (arxiv.org)

[2110.01191] 3D-Transformer: Molecular Representation with Transformer in 3D Space
The authors propose 3D-Transformer, a Transformer model that can be applied directly to the three-dimensional structures of molecules. Because interatomic distances vary with the atoms involved, they introduce multi-scale self-attention, which adjusts how attention is applied according to distance, and AFPS, which downsamples atoms according to their attention scores. The model performs well on quantum-chemistry molecular datasets.
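To make the idea of distance-dependent attention concrete, here is a minimal NumPy sketch. It is one plausible reading of "multi-scale" attention, not the paper's actual formulation: the function name, the cutoff values in `scales`, and the choice to concatenate per-scale outputs are all assumptions made for illustration, and the learned query/key/value projections are omitted.

```python
import numpy as np

def multi_scale_attention(x, coords, scales=(1.5, 3.0, 6.0)):
    """Toy distance-masked self-attention over atoms (illustrative only).

    x:      (n_atoms, d) atom features
    coords: (n_atoms, 3) atom positions
    scales: hypothetical distance cutoffs, one attention pass per cutoff
    """
    n, d = x.shape
    # Pairwise Euclidean distances between atoms.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Plain dot-product scores; learned projections are left out for brevity.
    scores = x @ x.T / np.sqrt(d)
    outputs = []
    for tau in scales:
        # At this scale, atoms farther apart than tau do not attend to each other.
        masked = np.where(dist <= tau, scores, -np.inf)
        # Numerically stable softmax; each atom always sees itself (distance 0).
        masked = masked - masked.max(axis=1, keepdims=True)
        weights = np.exp(masked)
        weights = weights / weights.sum(axis=1, keepdims=True)
        outputs.append(weights @ x)
    # Assumption: combine scales by concatenating along the feature axis.
    return np.concatenate(outputs, axis=-1)

rng = np.random.default_rng(0)
features = rng.normal(size=(5, 4))
positions = rng.normal(size=(5, 3))
print(multi_scale_attention(features, positions).shape)
```

With a very small cutoff each atom attends only to itself, while a large cutoff recovers ordinary global attention, so the cutoffs control how local each attention pass is.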

— — — — — — — — — — — — — — — — — — –

2. Machine Learning Use Case

NVIDIA has released GauGAN2, which can generate images from text

GauGAN2 takes a text prompt and generates a matching image. You can try it on the demo page (http://gaugan.org/gaugan2/).

— — — — — — — — — — — — — — — — — — –

3. Machine Learning Papers

GAN inversion adds meaning to the latent space (arxiv.org)

[2004.00049] In-Domain GAN Inversion for Real Image Editing
When recovering latent variables for an image with a pre-trained GAN, this work looks for latent codes that not only reconstruct the image but are also semantically meaningful in the latent space. As in conventional methods, an encoder maps the image to the latent space; in addition, the authors use a discriminator and apply regularization so that the features extracted by VGG do not change. The semantics of the image can then be changed by adjusting the latent code.

Using Hierarchical Encoders in GAN Inversion (arxiv.org)

[2104.07661] A Simple Baseline for StyleGAN Inversion
This paper proposes using hierarchical encoders for GAN inversion, that is, for inferring the latent representation of an image under a pre-trained StyleGAN. It is a multi-stage method that progressively predicts residuals of the latent code over multiple passes of the encoder. The proposed method significantly outperforms existing feed-forward methods.
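The multi-stage residual idea can be sketched in a few lines. This is a toy illustration under strong assumptions, not the paper's architecture: `generator` is a random linear map standing in for StyleGAN, `residual_encoder` is a hand-written function standing in for a trained encoder, and a single encoder is reused at every stage.

```python
import numpy as np

def refine_latent(image, generator, residual_encoder, w_init, n_stages=3):
    """Refine a latent code by adding a predicted residual at each stage."""
    w = w_init.copy()
    for _ in range(n_stages):
        reconstruction = generator(w)
        # Each stage compares the target with the current reconstruction
        # and predicts a correction to the current latent code.
        w = w + residual_encoder(image, reconstruction)
    return w

# Hypothetical stand-ins so the sketch runs end to end.
rng = np.random.default_rng(0)
A = rng.normal(size=(16, 4))   # toy linear "generator" weights
A_pinv = np.linalg.pinv(A)

def generator(w):
    return A @ w

def residual_encoder(image, reconstruction):
    # Maps half of the remaining image error back into latent space.
    return 0.5 * A_pinv @ (image - reconstruction)

w_true = rng.normal(size=4)
image = generator(w_true)
for stages in (1, 3, 10):
    w = refine_latent(image, generator, residual_encoder, np.zeros(4), n_stages=stages)
    print(stages, np.linalg.norm(generator(w) - image))
```

In this toy setup each stage removes half of the remaining latent error, so the reconstruction error shrinks as stages are added; a real encoder would be trained to predict the residual rather than computed from a pseudo-inverse.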

Proposed an embedding algorithm for StyleGAN and investigated the quality of its embedding representation (arxiv.org)

[1904.03189] Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
This study proposes an embedding algorithm for StyleGAN and investigates the quality of the resulting embeddings. An image is embedded by minimizing a combination of MSE and perceptual loss. The authors confirm that the embeddings can be used for morphing and style transfer.
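Optimization-based embedding of this kind can be sketched as gradient descent on a latent code. Everything below is a toy stand-in chosen so the example runs without a trained model: the linear map `G` plays the role of the generator, the linear map `F` plays the role of a perceptual feature extractor, and the weight `lam` and the plain gradient-descent optimizer are assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(32, 8))   # toy linear "generator": image = G @ w
F = rng.normal(size=(12, 32))  # toy linear "feature extractor" for the perceptual term

def embedding_loss(w, target, lam=1.0):
    residual = G @ w - target
    mse = np.mean(residual ** 2)               # pixel-space error
    perceptual = np.mean((F @ residual) ** 2)  # feature-space error
    return mse + lam * perceptual

def embedding_grad(w, target, lam=1.0):
    residual = G @ w - target
    grad_mse = 2.0 * G.T @ residual / residual.size
    feat = F @ residual
    grad_perceptual = 2.0 * G.T @ (F.T @ feat) / feat.size
    return grad_mse + lam * grad_perceptual

def embed(target, n_steps=500, lam=1.0):
    """Find a latent code whose generated image matches the target."""
    # The loss is quadratic here, so 1 / (largest Hessian eigenvalue) is a safe step size.
    FG = F @ G
    hessian = 2.0 * G.T @ G / G.shape[0] + 2.0 * lam * FG.T @ FG / F.shape[0]
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()
    w = np.zeros(G.shape[1])
    for _ in range(n_steps):
        w = w - lr * embedding_grad(w, target, lam)
    return w

target = G @ rng.normal(size=8)
w_hat = embed(target)
print(embedding_loss(np.zeros(8), target), embedding_loss(w_hat, target))
```

Unlike an encoder-based method, this approach needs no training but pays for it with an optimization loop per image.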

— — — — — — — — — — — — — — — — — — –

4. Technical Articles

Explanatory article on the diffusion model

This article explains the core concepts of diffusion models with many illustrations. It contains a lot of mathematical equations, but the explanation is detailed and easy to follow.
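As a small taste of the core concept, the forward (noising) process of a standard DDPM-style diffusion model can be sampled in closed form as x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. The sketch below assumes a linear noise schedule with commonly used default values; the function names are hypothetical, and the article's own notation may differ.

```python
import numpy as np

def make_alpha_bar(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t directly from x_0 without simulating every intermediate step."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = np.ones(10_000)  # toy "data": every value equals 1
for t in (0, 250, 999):
    xt = q_sample(x0, t, alpha_bar, rng)
    print(t, xt.mean(), xt.std())
```

Early steps leave the data almost untouched, while by the last step the sample is close to pure Gaussian noise; the model is then trained to reverse this process.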