Andrei Panferov

ML Research Scientist

Yandex Research

Biography

I’m a research scientist at Yandex Research. My research interests include natural language processing, efficient deep learning, and federated learning. I’m currently a final-year bachelor’s student at the Moscow Institute of Physics and Technology (MIPT).

Interests
  • Natural Language Processing
  • Efficient Deep Learning
  • Federated Learning
Education
  • PhD in Computer Science, 2024-now

    ISTA

  • BSc in Applied Mathematics and Physics, 2020-2024

    Moscow Institute of Physics and Technology

  • Machine Learning Engineer, 2021-2023

    Yandex School of Data Analysis

Experience

Wildberries
Senior ML Engineer (NLP)
April 2024 – Present Remote
  • Overseeing the LLM deployment

Yandex Research
ML Research Scientist
November 2023 – February 2024 Moscow, Russia
  • Wrote a first-author paper on LLM Compression
  • Achieved state-of-the-art results on LLM compression, reducing model size by 87% with acceptable loss in performance
  • Wrote efficient inference kernels using Triton and C++, speeding up LLM inference by up to 320%
  • Integrated the framework into the transformers library, enabling low RAM dispatch and reducing instance RAM requirements by 70%

KAUST, Optimization and Machine Learning Lab
Research Intern
July 2023 – September 2023 Saudi Arabia

Eqvilent (HFT Fund)
Infrastructure Developer
July 2022 – March 2023 Remote

Yandex
ML Engineer Intern (NLP)
March 2022 – July 2022 Moscow, Russia
  • Enabled abstract tabular data insertion for efficient map-reduce LLM inference, speeding up the tabular data processing by 120%
  • Increased test coverage of the map-reduce inference interface from 0% to 85% through rigorous unit testing

Terra Quantum AG
Researcher
July 2020 – July 2022 Moscow, Russia
  • Researched quantum algorithms for business applications
  • Developed an NMR spectra analysis tool, enabling its use for quantum computations
  • Optimized LLM deployment for chat assistant applications, reducing latency by 40%

Accomplishments

Israel
International Physics Olympiad
Gold Medal
