Akira’s Machine Learning News — Issue #36
Featured Papers/News This Week
- SimMIM, an image pre-training model with a structure similar to a masked language model, has been presented. The idea is similar to MAE, introduced last week, but the implementation is simpler. Pre-training image models with masking may become more common.
- Florence, a foundation model (a large model that can handle many tasks), was announced. While powerful pre-trained models such as GPT-3 have served as foundation models in NLP, there have been few such models for images.
— — — — — — — — — — — — — — — — — — –
In the following sections, I will introduce various articles and papers, covering not only the items above but also the following topics.
- Featured Papers/News This Week
- Machine Learning Use Case
- Papers
- Articles related to machine learning technology
— — — — — — — — — — — — — — — — — — –
1. Featured Papers/News This Week
A masked-language-model-style self-supervised learning method for images — arxiv.org
[2111.09886v1] SimMIM: A Simple Framework for Masked Image Modeling
This research pre-trains a Transformer by predicting the masked regions of an image with a regression objective on raw pixel values. Despite its very simple structure, it outperforms existing self-supervised learning methods such as DINO.
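To illustrate the idea, here is a minimal, generic masked-image-modeling sketch, not the paper's actual architecture; the patch size, masking ratio, and the tiny placeholder encoder are assumptions.

```python
# Minimal masked-image-modeling sketch (illustrative, not the official SimMIM code):
# randomly mask image patches, encode them, and regress the raw pixels of the
# masked patches with an L1 loss.
import torch
import torch.nn as nn

patch, dim = 16, 128
n_patches = (224 // patch) ** 2

encoder = nn.Sequential(                          # placeholder for the Transformer encoder
    nn.Linear(patch * patch * 3, dim), nn.GELU(), nn.Linear(dim, dim))
head = nn.Linear(dim, patch * patch * 3)          # lightweight prediction head
mask_token = nn.Parameter(torch.zeros(1, 1, patch * patch * 3))

imgs = torch.randn(8, 3, 224, 224)                # dummy batch of images
patches = imgs.unfold(2, patch, patch).unfold(3, patch, patch)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(8, n_patches, -1)

mask = torch.rand(8, n_patches) < 0.6             # mask ~60% of the patches
inp = torch.where(mask[..., None], mask_token, patches)

pred = head(encoder(inp))
loss = (pred - patches).abs()[mask].mean()        # regression loss on masked patches only
loss.backward()
```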
Foundation model for image-based tasks — arxiv.org
[2111.11432] Florence: A New Foundation Model for Computer Vision
The authors propose Florence, a foundation model that is trained on large datasets and can be applied to a variety of image-based tasks; it integrates CLIP-like language-image training with models adapted to each task. It achieved SotA performance on 44 benchmarks.
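As a rough illustration of the CLIP-like contrastive objective mentioned above, here is a sketch with random stand-in embeddings; Florence's actual encoders and loss differ in detail.

```python
# Sketch of a CLIP-style contrastive image-text objective (illustrative only;
# the embeddings below are random stand-ins for real image/text encoder outputs).
import torch
import torch.nn.functional as F

batch, dim = 32, 256
img_emb = F.normalize(torch.randn(batch, dim), dim=-1)   # image encoder output (stand-in)
txt_emb = F.normalize(torch.randn(batch, dim), dim=-1)   # text encoder output (stand-in)

logits = img_emb @ txt_emb.t() / 0.07                    # pairwise similarities / temperature
targets = torch.arange(batch)                            # matching pairs lie on the diagonal
loss = (F.cross_entropy(logits, targets) +               # image -> text direction
        F.cross_entropy(logits.t(), targets)) / 2        # text -> image direction
```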
— — — — — — — — — — — — — — — — — — –
2. Machine Learning Use Case
Controlling Robots with Deep Learning Models
An article on techniques for controlling robots with deep learning models. It argues that incorporating prior physical knowledge and invariants into the model is important.
— — — — — — — — — — — — — — — — — — –
3. Machine Learning Papers
You can do a lot with just StyleGAN — arxiv.org
[2111.01619] StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN
In recent years, many improved models have been built on top of StyleGAN. In this paper, however, the authors show that a pre-trained StyleGAN alone, with a few simple operations added, can perform tasks such as panoramic image generation, image mixing, and generation from a single image.
Which is more effective, increasing the number of samples or increasing the number of parameters? — arxiv.org
[2110.04374] A Few More Examples May Be Worth Billions of Parameters
A study of whether increasing the number of training samples or the model size is more effective. They found that it is task-dependent: for classification and extractive QA, around 2,000 samples can be comparable to a model with billions more parameters, but this does not hold for open-domain QA. The authors hypothesize that additional samples help when the output space is limited.
Downstream-task performance does not necessarily track pre-training performance — arxiv.org
[2110.02095] Exploring the Limits of Large Scale Pre-training
In more than 4,800 experiments, they showed that pre-training performance and downstream-task performance do not improve together linearly; downstream performance saturates. Furthermore, the best model for one downstream task is not necessarily the best model for another.
SAM is effective even for language models — arxiv.org
[2110.08529] Sharpness-Aware Minimization Improves Language Model Generalization
A paper confirming the effectiveness of SAM, an optimization method that takes the sharpness of the loss landscape into account, on language tasks. They showed that the benefit appears regardless of dataset or model size, and that it is larger when the dataset is smaller.
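For reference, the core SAM update is a two-pass procedure: perturb the weights toward higher loss, take the gradient there, then update the original weights with it. A minimal sketch follows; the function and its arguments are illustrative, not the paper's code.

```python
# Minimal sketch of one Sharpness-Aware Minimization (SAM) step (illustrative).
import torch

def sam_step(params, closure, base_optimizer, rho=0.05):
    """closure() recomputes the loss; base_optimizer is any torch optimizer."""
    params = [p for p in params if p.requires_grad]
    base_optimizer.zero_grad()
    closure().backward()                               # 1st pass: gradient at current weights
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
    eps = [rho * g / grad_norm for g in grads]
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.add_(e)                                  # climb toward the sharpest nearby point
    base_optimizer.zero_grad()
    closure().backward()                               # 2nd pass: gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(params, eps):
            p.sub_(e)                                  # restore the original weights
    base_optimizer.step()                              # descend with the sharpness-aware gradient
```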
— — — — — — — — — — — — — — — — — — –
4. Technical Articles
Interpretability of Models — thegradient.pub
An article on explainability in machine learning. It argues that explainability is needed for robustness, privacy, and fairness, surveys the explanation methods currently available, and discusses limitations such as models learning spurious correlations.
— — — — — — — — — — — — — — — — — — –
Other Blogs
— — — — — — — — — — — — — — — — — — –
About Me
Manufacturing Engineer / Machine Learning Engineer / Data Scientist / Master of Science in Physics / http://github.com/AkiraTOSEI/
On Twitter, I post one-sentence paper summaries.