Hiroki Naganuma

I am a Ph.D. candidate in Computer Science at Tokyo Institute of Technology, advised by Professor Rio Yokota. My research area is distributed systems and parallel computing, particularly High Performance Computing (HPC). My research interests center on large-scale parallelization and acceleration of machine learning. I previously worked in the Deep Learning Theory Team at the RIKEN Center for Advanced Intelligence Project (AIP), researching non-convex optimization and the parallelization of deep learning.
My CV can be found here. (Last updated: Aug 2019)

Career

Apr. 2019 ~

Ph.D. student in Computer Science, Rio Yokota Lab.
School of Computing, Tokyo Institute of Technology

Adviser : Rio Yokota
Topics : High Performance Computing, Machine Learning

Sep. 2018 ~ Oct. 2018

Research Internship at RIKEN Center for Advanced Intelligence Project (AIP).
Adviser : Taiji Suzuki
Topics : Deep Learning Theory, Stochastic Optimization, Statistical Learning Theory

Apr. 2017 ~ Mar. 2018

Research Internship at IBM Research Tokyo.
Advisers : Taro Sekiyama and Kiyokuni Kawachiya
Topics : Distributed Deep Learning, Optimization of Deep Learning Framework

Apr. 2017 ~ Mar. 2019

Master of Engineering, Rio Yokota Lab.
Department of Computer Science, Tokyo Institute of Technology

Adviser : Rio Yokota
Topics : High Performance Computing, Machine Learning
Other : Summa Cum Laude (graduated first in class)

Feb. 2017 ~ Mar. 2017

Internship at Mercari, Inc.
Topics : User interviews and improvement proposals for Mercari and its C2C service

Aug. 2016 ~ Dec. 2018

Internship at SORACOM, Inc.
Topics : Developed an internal cloud computing system using Apache Spark

Dec. 2015 ~ Jan. 2016

Internship at Nomura Research Institute, Ltd.
Topics : Developed an internal system and performed consulting work based on data analysis

Aug. 2015 ~ Dec. 2015

Internship at CyberAgent, Inc.
Topics : Developed an Android application on the AbemaTV FRESH! development team for the Internet TV service AbemaTV

Aug. 2015 ~ Oct. 2015

Internship at teamLab Inc.
Topics : Developed internal systems and digital art works using Java, Ruby, C++, and OpenGL

Aug. 2015 ~ Sep. 2015

Internship at Livesense Inc.
Topics : Developed a web application for a recruitment site that uses image analysis to generate project listings with optimized image selection

Apr. 2013 ~ Mar. 2017

Bachelor of Engineering
Department of Computer Science, Tokyo Institute of Technology

Advisers : Rio Yokota and Mayuko Nakamaru
Topics : High Performance Computing, Machine Learning, Simulation
Other : Incentive Award of the Director of the School of Computing, Tokyo Institute of Technology

Apr. 2010 ~ Mar. 2013

Chuo University Suginami High School
Other : Tokyo Metropolitan Governor Award (graduated first in class)

Nov. 1994

Born in Fukuoka Prefecture, Japan

Research

Research Interests

  • Large Scale Distributed Data Parallel Deep Learning
  • Statistical Learning Theory
  • Non-Convex Optimization
  • Information Geometry
  • Gaussian Process

Smoothing of the Objective Function in Stochastic Optimization for Large Scale Parallel Deep Learning

keywords : large scale deep learning / non-convex optimization / smoothing loss function / second-order optimization

Classical learning theory states that when the number of parameters in a model is too large relative to the data, the model will overfit and generalization performance deteriorates. However, it has been shown empirically that deep neural networks (DNNs) can achieve high generalization capability when trained with extremely large amounts of data and model parameters, exceeding the predictions of classical learning theory. One drawback is that training DNNs requires enormous computation time, so the training time must be reduced through large-scale parallelization. Straightforward data-parallelization of DNN training, however, degrades convergence and generalization. In the present work, we investigate the possibility of using second-order methods to close this generalization gap in large-batch training. This is motivated by our observation that each mini-batch becomes more statistically stable as the batch grows, so the effect of considering the curvature plays a more important role in large-batch training. We have also found that naively applying the natural gradient method causes generalization performance to deteriorate further due to its lack of regularization capability. We propose an improved second-order method that smooths the loss function, which allows second-order methods to generalize as well as mini-batch SGD.
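The smoothing idea itself is easy to prototype. Below is a minimal, illustrative sketch assuming a Monte-Carlo formulation of the smoothed objective L_smooth(w) = E_eps[L(w + eps)] with Gaussian noise; the toy model, noise scale sigma, and sample count are placeholder choices, and this is not the exact procedure or second-order optimizer from the paper.

```python
# Hypothetical sketch: Monte-Carlo gradient of a smoothed loss
# L_smooth(w) = E_{eps ~ N(0, sigma^2 I)} [ L(w + eps) ].
import torch
import torch.nn as nn

def smoothed_grad_step(model, loss_fn, x, y, sigma=0.01, n_samples=4):
    """Accumulate an unbiased Monte-Carlo estimate of grad L_smooth in p.grad."""
    model.zero_grad()
    avg_loss = 0.0
    for _ in range(n_samples):
        noises = [sigma * torch.randn_like(p) for p in model.parameters()]
        with torch.no_grad():
            for p, eps in zip(model.parameters(), noises):
                p.add_(eps)                    # perturb: w <- w + eps
        loss = loss_fn(model(x), y) / n_samples
        loss.backward()                        # gradient at the perturbed point
        with torch.no_grad():
            for p, eps in zip(model.parameters(), noises):
                p.sub_(eps)                    # restore the original weights
        avg_loss += loss.item()
    return avg_loss

# Toy usage (sizes are arbitrary): any optimizer can consume the smoothed gradient.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
print("smoothed loss:", smoothed_grad_step(model, nn.CrossEntropyLoss(), x, y))
optimizer.step()
```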

Verification of Speeding Up Using Low Precision Arithmetic in Convolutional Neural Network

keywords : image recognition / convolutional neural network / low-precision / half-precision / quantization

The recent trend in convolutional neural networks (CNNs) is toward deeper multilayered structures. While this improves model accuracy, the amounts of computation and data involved in training and inference increase. To address this problem, several techniques have been proposed that reduce the data and computation by lowering the numerical precision of computation and data, exploiting CNNs' robustness to noise. However, there has been little discussion of the relationship between parameter compression and speedup within each layer of a CNN. In this research, we propose a method to speed up inference by applying low precision to a trained model: half-precision floating-point SIMD instructions accelerate computation-bound layers, while the reduced data size of the CNN model speeds up data access. We examined the effect of our method on CNN recognition accuracy and the speedup of each layer, and analyzed the reasons for both.
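As a rough illustration of the accuracy side of this trade-off, the PyTorch sketch below casts a stand-in model to FP16 and compares its output with the FP32 reference. It does not reproduce the half-precision SIMD kernels examined in this work, and FP16 kernel availability depends on the backend and PyTorch version.

```python
# Illustrative only: post-training FP16 cast of a stand-in CNN.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # FP16 support varies by backend

model = nn.Sequential(                      # placeholder architecture
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
).eval().to(device)

x = torch.randn(1, 3, 32, 32, device=device)

with torch.no_grad():
    ref = model(x).float()                  # FP32 reference output
    model.half()                            # cast weights to FP16 (halves model size)
    out = model(x.half()).float()           # cast activations to FP16 as well

fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print(f"FP16 parameter size: {fp16_bytes} bytes (half of FP32)")
print(f"max deviation from FP32 output: {(out - ref).abs().max().item():.3e}")
```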

Accelerating Matrix Multiplication in Deep Learning by using Low-Rank Approximation

keywords : image recognition / convolutional neural networks / low-rank approximation / tensor decomposition

In image recognition using convolutional neural networks (CNNs), convolution operations occupy the majority of the computation time. To cope with this problem, methods that compress the dense tensors in convolution layers using low-rank approximation have been proposed to reduce the amount of computation, but these studies have not revealed the trade-off between the computational complexity saved by low-rank approximation and the resulting image recognition accuracy. In this research, we investigated the trade-off between image recognition accuracy and speed-up rate on GPUs for the method proposed by Peisong Wang et al.
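Independent of Wang et al.'s specific decomposition scheme, the core arithmetic of this trade-off can be sketched with a truncated SVD of a dense weight matrix: replacing one m x n matmul by two low-rank ones cuts the cost from m*n to (m+n)*r multiply-adds per input vector. The sizes and rank below are placeholder values.

```python
# Generic truncated-SVD sketch; not Wang et al.'s decomposition scheme.
import torch

m, n, r = 512, 512, 32                      # illustrative sizes and target rank
W = torch.randn(m, n)                       # random stand-in for a trained weight
x = torch.randn(n)

U, S, Vh = torch.linalg.svd(W, full_matrices=False)
U_r = U[:, :r] * S[:r]                      # absorb singular values into the left factor
V_r = Vh[:r, :]                             # rank-r factorization: W ~ U_r @ V_r

y_full = W @ x                              # m*n multiply-adds
y_low = U_r @ (V_r @ x)                     # (m+n)*r multiply-adds

rel_err = (y_full - y_low).norm() / y_full.norm()
print(f"relative error {rel_err:.3f} at {(m + n) * r / (m * n):.1%} of the FLOPs")
# A random matrix has a flat spectrum, so this error is pessimistic;
# trained weights typically have faster-decaying singular values.
```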

Membership

  • Association for Computing Machinery (ACM)
  • The Japanese Society for Artificial Intelligence (JSAI)
  • Information Processing Society of Japan (IPSJ)
  • The Japan Society for Computational Engineering and Science (JSCES)
  • The Institute of Electronics, Information and Communication Engineers (IEICE)

Scholarships and Grants-in-Aid


Publication

Google Scholar, ORCID

International Conference

  1. Hiroki Naganuma, Rio Yokota, "A Performance Improvement Approach for Second-Order Optimization in Large Mini-batch Training", 2nd High Performance Machine Learning Workshop held in conjunction with CCGrid2019 (HPML2019), 2019. [paper/english] [slide/english]
  2. Minatsu Sugimoto, Eitaro Yamatsuta, Hiroki Naganuma, Riku Arakawa, Yudai Ushio, Masayuki Teramoto, "Design of smart pacifier detecting dehydration symptoms in baby's and sharing parents behavior", Tsukuba Global Science Week 2018 (TGSW2018), 2018. [Excellent Poster Award] [link/english] [paper/english]
  3. Hiroki Naganuma, Rio Yokota, "Accelerating Convolutional Neural Networks Using Low Precision Arithmetic", HPC Asia 2018 Poster Session, 2018. [paper/english] [poster/english]
  4. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, and Rio Yokota, "Accelerating Matrix Multiplication in Deep Learning by using Low-Rank Approximation", The 2017 International Conference on High Performance Computing & Simulation (HPCS 2017), 2017. [paper/english]

Domestic Conference and Symposium

  1. Hiroki Naganuma, Rio Yokota, "A Preliminary Study of Adaptive Learning Rate for Stochastic Optimization in Convolutional Neural Network", CREST 3 AI Projects Joint Meeting 2019, 2019.
  2. Hiroki Naganuma, "Investigation of Second-Order Optimization in Large Mini-batch Training", INTERNATIONAL HPC SUMMER SCHOOL 2019 (IHPCSS2019), 2019. [link/english]
  3. Hiroki Naganuma, Rio Yokota, "Investigation of Smoothing in Natural Gradient Method for Large Mini-batch Training", The 3rd cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSig2019), 2019. [Outstanding Presentation Award] [link/english]
  4. Hiroki Naganuma, Shun Iwase, Rio Yokota, "Verification of Reducing the Number of Iterations in Large Mini-Batch Training by Applying Mixup", The 3rd cross-disciplinary Workshop on Computing Systems, Infrastructures, and Programming (xSig2019), 2019. [link/japanese]
  5. Hiroki Naganuma, Rio Yokota, "Noise Injection Leads to Better Generalization in Large Mini-Batch Training", Tokyo Institute of Technology and Stony Brook University Joint Science and Technology Meeting, 2019.
  6. Hiroki Naganuma, Rio Yokota, "A Study on Generalization Performance Improvement Method on Large Batch Learning Using Averaging by Noise Injection", The Institute of Electronics Information and Communication Engineers General Conference 2019, 2019. [Information and Systems Society Poster Award (Acceptance Rate 6.3%)] [link/japanese] [link/japanese]
  7. Hiroki Naganuma, Rio Yokota, "Smoothing of Objective Function for Large Scale Parallel Deep Learning", The 81st National Convention of Information Processing Society of Japan, 2019. [link/japanese]
  8. Hiroki Naganuma, Rio Yokota, "Smoothing of the Objective Function in Stochastic Optimization for Large Scale Parallel Learning", CREST-Deep Symposium, 2018. [link/english]
  9. Hiroki Naganuma, Shun Iwase, Kinsho Kaku, Hikaru Nakata, Rio Yokota, "Hyperparameter Optimization of Large Scale Parallel Deep Learning using Natural Gradient Approximation Method", Forum for Information and Technology 2018 (FIT2018), 2018. [link/japanese] [link/english]
  10. Hiroki Naganuma, Akira Sekiya, Kazuki Osawa, Hiroyuki Ootomo, Yuji Kuwamura, Rio Yokota, "Verification of speeding up using low precision arithmetic in convolutional neural network", GTC Japan 2017 Poster Session, 2017. [link/japanese]
  11. Hiroki Naganuma, Akira Sekiya, Kazuki Osawa, Hiroyuki Ootomo, Yuji Kuwamura, Rio Yokota, "Improvement of speed using low precision arithmetic in deep learning and performance evaluation of accelerator", Technical Committee on Pattern Recognition and Media Understanding (PRMU), 2017. [link/english] [link/english]
  12. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota, "Accelerating Convolutional Neural Networks Using Low-Rank Tensor Decomposition", Technical Committee on Pattern Recognition and Media Understanding (PRMU), 2017. [link/english] [link/english]
  13. Hiroki Naganuma, Kazuki Osawa, Akira Sekiya, Rio Yokota, "Acceleration of compression model using half-precision arithmetic in deep learning", Japan Society for Industrial and Applied Mathematics (JSIAM2017), 2017. [link/japanese] [link/english]
  14. Kazuki Osawa, Akira Sekiya, Hiroki Naganuma, Rio Yokota, "Accelerating Convolutional Neural Networks using Low-Rank Approximation", Proceedings of the Conference on Computational Engineering and Science Vol. 22, 2017. [link/english] [link/english]
  15. Akira Sekiya, Kazuki Osawa, Hiroki Naganuma, Rio Yokota, "Acceleration of matrix multiplication of deep learning using low rank approximation", 158th Research Presentation Seminar in High Performance Computing, 2017. [link/english] [paper/japanese]

Work

Astral Body (Mar. 2018)

This work searches for the very essence of life by comparing and expressing "the signs of life" and "physical forms" in living creatures.

Your Pacifier (Nov. 2017)

"YourPacifier" is a smart pacifier to support your children and you.

Walky (Nov. 2016)

Smart white-cane device that tells users what obstacles are and where they are.

HoloChat (Jun. 2016)

Real-time telecommunication system using pseudo hologram. Detailed implementation is described on GitHub.

TEDxUTokyo Movie Production (May. 2016)

Stage movie direction and production for TEDxUTokyo 2016, and real-time SNS analysis for visualization.

Musashino Art University Arts Festival Projection Mapping Project (Oct. 2015)

At the Musashino Art University Art Festival, I planned and ran an interactive projection mapping project as team leader and engineer.

Others

Skills

C / C++ / OpenMP / CUDA / MPI / Java / Apache Spark / Android / OpenGL / OpenCV / openFrameworks / Python / Caffe / Caffe2 / Chainer / ChainerMN / TensorFlow / PyTorch

Major Award

  1. Completed Master's Course Summa Cum Laude (graduated first in class) (Mar. 2019)
  2. James Dyson Award 2018 National Winner (Acceptance Rate 2%) (Dec. 2018)
  3. Art Hack Day 2018 Grand Prize (the 1st in 12 teams) (Mar. 2018)
  4. 2025 Japan World Expo Committee Creative Competition Digital Creative Department First Prize (Acceptance Rate 3.5%) (Nov. 2017)
  5. Stanford Health Hackathon 2017 3rd Prize (the 3rd in 52 teams) (Oct. 2017)
  6. Imagine Cup 2017 Japan qualifying First Prize (Mar. 2017)
  7. Incentive award of director of School of Computing Tokyo Institute of Technology (Mar. 2017)
  8. MashupAward-2016 Japan’s Largest Hackathon Student Division First Prize (Nov. 2016)
  9. SC16 HPC for Undergraduates Program, First Japanese Student Selected (Nov. 2016)
  10. JPHACKS-2016 Japan’s Largest Student Hackathon First Prize (the 1st in 89 teams) (Oct. 2016)
  11. Tokyo Metropolitan Governor Award (graduated first in class) (Mar. 2013)

Other Award

  1. Kitakyushu Digital Creator Contest 2019 Nominee (Mar. 2019)
  2. The 24th Campus Genius Contest Nominee (Oct. 2018)
  3. The 4th Whole Brain Architecture Hackathon Outstanding Performance Award (Oct. 2018)
  4. MashupAward2017 Business Egg Division Finalist (Nov. 2017)
  5. MashupAward2017 Osaka qualifying 2nd Prize (Oct. 2017)
  6. Stanford Health Hackathon 2017 Persistent-Neodesign $1k prizes (Oct. 2017)
  7. Sony Hackathon 2017 First Prize (Feb. 2017)
  8. Tokyo Institute of Technology Engineering Design Competition First Prize (Feb. 2017)
  9. JPHACKS-2016 Japan's Largest Student Hackathon Tokyo qualifying First Prize, Mitsubishi UFJ Securities Holdings Co., Ltd. Award, ImagineCup Award, MashupAward's Award, Softbank Award, AbemaTV Award, Best Idea Award, Innovator Certification (Nov. 2016)
  10. Android Library Hackathon 2016 sponsored by CyberAgent 2nd Prize (Nov. 2016)
  11. HackU@Tokyo-Tech 2016 sponsored by Yahoo! Japan Tokyo-Tech Prize (Jun. 2016)
  12. Sony Hackathon 2016 First Prize (Feb. 2016)
  13. IT Draft 2016 First-Round Draft Pick by DeNA (Jan. 2016)
  14. JPHACKS-2015 Japan’s Largest Student Hackathon NTT Communications Award (Nov. 2015)
  15. The first Git Challenge 2015 sponsored by Mixi 3rd Prize (Nov. 2015)
  16. Android Library Hackathon 2015 sponsored by CyberAgent 2nd Prize (Nov. 2015)
  17. JINS MEME DEVELOPER IDEA PITCH CONTEST Prize Winner (Nov. 2015)
  18. Life is Tech! All Star Hackathon First Prize (Mar. 2014)
  19. Nikkei Education Challenge Report Contest Lecturer's Special Award (Aug. 2012)

Community
