Akira’s Machine Learning News — Issue #38

Dec 23, 2021

Featured Paper/News in This Week.

A 3D-Transformer that can be applied directly to molecular structures has been proposed. Its attention weights are adjusted according to interatomic distances, and its computational cost does not appear to be especially high.

— — — — — — — — — — — — — — — — — — –

In the following sections, I introduce articles and papers on the topic above as well as on the following topics.

Featured Paper/News in This Week
Machine Learning Use Case
Papers
Articles related to machine learning technology

— — — — — — — — — — — — — — — — — — –

1. Featured Paper/News in This Week

Transformer model applicable to three-dimensional molecular structures — arxiv.org

[2110.01191] 3D-Transformer: Molecular Representation with Transformer in 3D Space
They proposed 3D-Transformer, a Transformer model that can be applied to the three-dimensional structures of molecules. Because interatomic distances differ depending on the atoms involved, they introduce multi-scale self-attention, which adjusts how attention is applied according to distance, and AFPS, which downsamples according to the attention scores. The model performed well on quantum-chemistry molecular datasets.
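
To make the idea of distance-dependent attention concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: the hard distance cutoff, the averaging over scales, the cutoff values, and the function names are all assumptions chosen for illustration. Each atom attends only to atoms within a given radius, and the results from several radii are averaged.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distance_masked_attention(features, coords, cutoff):
    """Self-attention in which atom i attends only to atoms within `cutoff`.

    Hypothetical helper: queries, keys and values are all the raw features,
    so there are no learned projections in this sketch.
    """
    dim = len(features[0])
    outputs = []
    for i, query in enumerate(features):
        # An atom is always within distance 0 of itself, so this is never empty.
        neighbours = [
            j for j in range(len(features))
            if math.dist(coords[i], coords[j]) <= cutoff
        ]
        scores = [
            sum(q * k for q, k in zip(query, features[j])) / math.sqrt(dim)
            for j in neighbours
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * features[j][k] for w, j in zip(weights, neighbours))
            for k in range(dim)
        ])
    return outputs


def multi_scale_attention(features, coords, cutoffs=(1.5, 3.0, float("inf"))):
    """Average the distance-masked attention outputs over several radii.

    The cutoff values are illustrative assumptions, not values from the paper.
    """
    per_scale = [distance_masked_attention(features, coords, c) for c in cutoffs]
    return [
        [sum(scale[i][k] for scale in per_scale) / len(per_scale)
         for k in range(len(features[0]))]
        for i in range(len(features))
    ]


# Three "atoms": two close together and one far away.
atom_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
atom_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
mixed = multi_scale_attention(atom_features, atom_coords)
```

With the smallest radius, the distant third atom sees only itself, while the largest radius lets every atom attend to the whole molecule.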

— — — — — — — — — — — — — — — — — — –

2. Machine Learning Use Case

NVIDIA has released GauGAN2, which can generate images from text

NVIDIA Research's GauGAN AI Art Demo Responds to Words | NVIDIA Blog

A picture worth a thousand words now takes just three or four words to create, thanks to GauGAN2, the latest version of…

blogs.nvidia.com

NVIDIA has released GauGAN2, which generates images from text: you enter a short phrase and it produces a matching image. You can try it on the demo page (http://gaugan.org/gaugan2/).

— — — — — — — — — — — — — — — — — — –

3. Machine Learning Papers

GAN inversion adds meaning to the latent space — arxiv.org

[2004.00049] In-Domain GAN Inversion for Real Image Editing
For the problem of recovering latent variables from an image with a pre-trained GAN, this work not only finds a latent code that reconstructs the image but also makes the resulting latent representation semantically meaningful. As in conventional methods, an encoder maps the image to the latent space; in addition, they use a discriminator and apply regularization so that the features extracted by VGG do not change. The semantics of the image can then be changed by adjusting the latent code.

Using Hierarchical Encoders in GAN Inversion — arxiv.org

[2104.07661] A Simple Baseline for StyleGAN Inversion
This paper proposes using hierarchical encoders in GAN inversion to infer a pre-trained StyleGAN's embedded representation of an image. It is a multi-stage method that progressively predicts residuals of the latent code over multiple passes of the encoder. The proposed method significantly outperforms existing feed-forward methods.
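
The "predict a residual, then refine" loop described above can be sketched with a toy example. Everything here is a stand-in chosen for illustration: `toy_generator` and `toy_residual_encoder` are hypothetical functions, not StyleGAN or the paper's encoder, and the encoder is deliberately imperfect so that several passes are needed.

```python
def toy_generator(latent):
    """Hypothetical stand-in for a pre-trained generator: image = 2 * latent."""
    return [2.0 * w for w in latent]


def toy_residual_encoder(image, reconstruction):
    """Hypothetical encoder that predicts a latent correction.

    It recovers only half of the ideal correction per pass, which is why
    the refinement loop below benefits from multiple passes.
    """
    return [0.25 * (x - r) for x, r in zip(image, reconstruction)]


def invert(image, initial_latent, num_passes):
    """Refine the latent code by repeatedly adding the predicted residual."""
    latent = list(initial_latent)
    for _ in range(num_passes):
        residual = toy_residual_encoder(image, toy_generator(latent))
        latent = [w + dw for w, dw in zip(latent, residual)]
    return latent


def reconstruction_error(image, latent):
    """Squared error between the target image and its reconstruction."""
    return sum((x - r) ** 2 for x, r in zip(image, toy_generator(latent)))


target = [1.0, -2.0, 0.5]
start = [0.0, 0.0, 0.0]
for passes in (0, 1, 5):
    print(passes, reconstruction_error(target, invert(target, start, passes)))
```

In this toy setting the reconstruction error shrinks with every pass, which is the behaviour the multi-stage design relies on.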

Proposed an embedding algorithm for StyleGAN and investigated the quality of its embedding representation — arxiv.org

[1904.03189] Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
This study proposes an embedding algorithm for StyleGAN and investigates the quality of the resulting embeddings. An image is embedded by minimizing the MSE and perceptual losses. They confirmed that the embedding can be used for morphing and style transfer.
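
As a rough illustration of embedding by loss minimization, the sketch below runs gradient descent on a latent code to minimize an MSE term plus a perceptual-style term. The setup is assumed for illustration only: `GEN` is a tiny linear "generator" rather than StyleGAN, `FEAT` is a fixed linear map standing in for a perceptual feature extractor such as VGG, and the weight `lam`, step count and learning rate are arbitrary.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Hypothetical linear generator: 2-d latent code -> 3-pixel "image".
GEN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical feature extractor used for the perceptual-style term.
FEAT = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]


def loss_and_grad(latent, target, lam=1.0):
    """Return (MSE + lam * feature-space MSE) and its gradient w.r.t. latent."""
    residual = [g - t for g, t in zip(matvec(GEN, latent), target)]
    feat_residual = matvec(FEAT, residual)
    mse = sum(r * r for r in residual) / len(residual)
    perceptual = sum(f * f for f in feat_residual) / len(feat_residual)
    # Gradient with respect to the generated image, then chained through GEN.
    grad_mse = [2.0 * r / len(residual) for r in residual]
    grad_perc = [
        2.0 * lam * v / len(feat_residual)
        for v in matvec(transpose(FEAT), feat_residual)
    ]
    grad_image = [a + b for a, b in zip(grad_mse, grad_perc)]
    return mse + lam * perceptual, matvec(transpose(GEN), grad_image)


def embed(target, steps=200, lr=0.1):
    """Find a latent code whose generated image matches `target`."""
    latent = [0.0, 0.0]
    for _ in range(steps):
        _, grad = loss_and_grad(latent, target)
        latent = [w - lr * g for w, g in zip(latent, grad)]
    return latent


# An image that the toy generator can produce exactly from latent (0.5, -1.0).
target_image = matvec(GEN, [0.5, -1.0])
recovered = embed(target_image)
```

Because the target lies in the range of the toy generator, the optimization recovers the latent code that produced it; for a real image and a real generator, the loss would generally only be reduced, not driven to zero.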

— — — — — — — — — — — — — — — — — — –

4. Technical Articles

Explanatory article on the diffusion model

What are Diffusion Models?

Diffusion models are a new type of generative models that are flexible enough to learn any arbitrarily complex data…

lilianweng.github.io

An explanatory article on diffusion models. It explains their core concepts with many illustrations. There are many mathematical equations, but the explanations are detailed and easy to follow.
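
One core concept in diffusion models is the forward process, which gradually adds Gaussian noise to the data. The sketch below uses the standard DDPM-style closed form, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, in plain Python. The linear schedule endpoints and the function names are assumptions for illustration and are not taken from the article.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, increasing linearly (assumed endpoints)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t directly from x_0 using the closed-form forward process."""
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
data = [1.0, -1.0, 0.5, 0.0]
slightly_noisy = q_sample(data, 0, alpha_bars, rng)
almost_pure_noise = q_sample(data, 999, alpha_bars, rng)
```

Early steps keep the sample close to the data, while at the final step almost none of the original signal remains; the generative model is trained to reverse this process.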

— — — — — — — — — — — — — — — — — — –

Other Blogs

Machine Learning 2020 summary: 84 interesting papers/articles

In this article, I present a total of 84 papers and articles published in 2020 that I found particularly interesting…

towardsdatascience.com

Recent Developments and Views on Computer Vision x Transformer

On the differences between Transformer and CNN, why Transformer matters, and what its weaknesses are.

towardsdatascience.com

Reach and Limits of the Supermassive Model GPT-3

In this blog post, I will give a technical explanation of GPT-3, what GPT-3 has achieved, and what GPT-3 could not…

medium.com

Do Vision Transformers See Like Convolutional Neural Networks? (Paper Explained)

I will take a closer look at the differences between the representations obtained by CNNs and Transformers.

towardsdatascience.com

— — — — — — — — — — — — — — — — — — –

About Me

Manufacturing Engineer/Machine Learning Engineer/Data Scientist / Master of Science in Physics / http://github.com/AkiraTOSEI/

LinkedIn profile

Twitter: I post one-sentence paper commentaries.

Written by Akihiro FUJII