Ciao! I am a Senior Research Scientist at IBM Research AI. Since 2012 I have been at the T. J. Watson Research Lab in New York. Before that, I did my PhD at Columbia University with Professor John Kender. My interests are in Multimedia and applications of Deep Learning in Computer Vision and NLP, with a focus on neural architecture search for vision and language models, and sports video analysis. I have publications in several peer-reviewed journals and conferences, including IEEE TMM, CVPR, ACM Multimedia, AAAI, ICMR, and MICCAI. Among other professional activities, I have served as an Associate Editor for the IEEE Transactions on Multimedia (2021-2023), as Area Chair for ACM Multimedia in 2016 and 2017, and as Local Organization Chair and Web Chair for ICMR in 2016. My work has been recognized in the popular press (including the New York Times, Fortune, and NBC News), and I have been fortunate to win some awards, including the 2023 Tech Emmys. You can look at my CV here.
PhD in Computer Science, 2013
Columbia University
MS in Computer Science, 2008
Columbia University
MEng in Telecommunications Engineering, 2007
University of Trento, Italy
- 9th Workshop on Computer Vision in Sports (CVSports) @CVPR 2023
- 4th Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2023
- We helped IBM win a Tech Emmy Award for AI-ML curation of Sports Highlights @EMMYS 2023!
- 8th Workshop on Computer Vision in Sports (CVSports) @CVPR 2022
- Nominated Outstanding Reviewer @CVPR 2021
- I have been appointed Associate Editor for IEEE Transactions on Multimedia (TMM) (2021-2023)!
- 7th Workshop on Computer Vision in Sports (CVSports) @CVPR 2021
- 2nd Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2021
- NASTransfer paper accepted to AAAI (Feb 2021)
- 6th Workshop on Computer Vision in Sports (CVSports) @CVPR 2020
- Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020
- 2nd Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
- The Diversity in Faces dataset is out. Check it out here (Jan 2019)
- AI Self Portrait in the art gallery of the NIPS Workshop on Machine Learning for Creativity (Dec 2018)
- Our AI Self Portrait has been published in the New York Times! (Oct 2018)
- Cognitive Highlights work accepted to IEEE TMM (Sep 2018)
- Cognitive Highlights wins 2018 Best Digital Development at the Yahoo Sports Tech Awards for Wimbledon!
- Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
- New Face Attributes models coming to WDC Visual Recognition mitigating bias (Feb 2018)
- Cognitive Highlights @US Open! (Sep 2017)
- Presenting Cognitive Highlights (@Wimbledon July 2017) in the CV in Sports Workshop @CVPR 2017
- Food recognition model launched in beta in WDC Visual Recognition (May 2017)
- Cognitive Highlights @Golf Masters! (Apr 2017)
- I am Area Chair for ACM Multimedia 2017 in the Multimedia Search and Recommendation track
- Presenting our delicious Food analytics papers @ACM Multimedia 2016
- I am Local Organization Chair and Webmaster for ICMR 2016
- I acted as a Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
- I am Area Chair for ACM Multimedia 2016 in the Multimedia Search and Recommendation track
- Received the outstanding reviewer award from the 2015 International Conference on Multimedia Retrieval (ICMR)
Masayasu Muraoka, Bishwaranjan Bhattacharjee, Michele Merler, Graeme Blackwood, Yulong Li, and Yang Zhao. Cross-Lingual Transfer of Large Language Model by Visually-Derived Supervision Toward Low-Resource Languages. ACM Multimedia (MM) 2023. PDF BibTeX
Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha and Reyhaneh Jabbarvand. Understanding the Effectiveness of Large Language Models in Code Translation. arXiv (arXiv) 2023. arXiv BibTeX
Takuma Udagawa, Aashka Trivedi, Michele Merler and Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. EMNLP Industry Track (EMNLP) 2023. arXiv BibTeX
Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh and Maja Vukovic. CoSiNES: Contrastive Siamese Network for Entity Standardization. ACL Matching Workshop (ACLW) 2023. arXiv code BibTeX
Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi and Bishwaranjan Bhattacharjee. Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models. arXiv (arXiv) 2022. arXiv BibTeX
Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio S Feris, Bishwaranjan Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. 35th AAAI Conference on Artificial Intelligence (AAAI) 2021. arXiv BibTeX
Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. arXiv (arXiv) 2020. arXiv BibTeX
Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M Gliozzo, John R Smith. Covering the News with (AI) Style. arXiv (arXiv) 2020. arXiv BibTeX
Michele Merler, Nalini Ratha, Rogerio S Feris, John R Smith. Diversity in Faces. arXiv (arXiv) 2019. arXiv BibTeX Project
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R Smith, Rogerio S Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on Multimedia (TMM) 2018. PDF BibTeX Project
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. 3rd Workshop of Computer Vision in Sports @CVPR (CVPRW) 2017. PDF BibTeX Slides Project
Dhiraj Joshi, Michele Merler, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. IBM High-Five: Highlights From Intelligent Video Engine. ACM Multimedia (MM) 2017. PDF BibTeX Project
Xiaolong Wang, Guodong Guo, Michele Merler, Noel CF Codella, MV Rohith, John R Smith, Chandra Kambhamettu. Leveraging multiple cues for recognizing family photos. Image and Vision Computing (IVC) 2017. PDF BibTeX
Michele Merler, Hui Wu, Rosario Uceda-Sosa, Quoc-Bao Nguyen, John R Smith. Snap, Eat, RepEat: a food recognition engine for dietary logging. 2nd International Workshop on Multimedia Assisted Dietary Management @ACM Multimedia (MADIMA) 2016. PDF BibTeX Project Slides Poster
Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R Smith. Learning to make better mistakes: Semantics-aware visual food recognition. ACM Multimedia (MM) 2016. PDF BibTeX Project
Michele Merler, Liangliang Cao, John R Smith. You are what you tweet… pic! gender prediction based on semantic analysis of social media images. IEEE International Conference on Multimedia and Expo (ICME) 2015. PDF BibTeX Slides
Junjie Cai, Michele Merler, Sharath Pankanti, Qi Tian. Heterogeneous semantic level features fusion for action recognition. ACM International Conference on Multimedia Retrieval (ICMR) 2015. PDF BibTeX
Mani Abedini, Noel CF Codella, Jonathan H Connell, Rahil Garnavi, Michele Merler, Sharath Pankanti, John R Smith, Tanveer Syeda-Mahmood. A generalized framework for medical image classification and recognition. IBM Journal of Research and Development (IBM-JRD) 2015. PDF BibTeX
Felix X Yu, Liangliang Cao, Michele Merler, Noel Codella, Tao Chen, John R Smith, Shih-Fu Chang. Modeling attributes from category-attribute proportions. ACM Multimedia (MM) 2014. PDF BibTeX
Noel Codella, Jonathan Connell, Sharath Pankanti, Michele Merler, John R Smith. Automated medical image modality recognition by fusion of visual and text information. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014. PDF BibTeX CLEF13 Slides
Michele Merler, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev. Semantic model vectors for complex video event recognition. IEEE Transactions on Multimedia (TMM) 2012. PDF BibTeX Project
Michele Merler, Rong Yan, John R Smith. Imbalanced rankboost for efficiently ranking large-scale image/video collections. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2009. PDF BibTeX Poster
Rong Yan, Marc-Olivier Fleury, Michele Merler, Apostol Natsev, John R. Smith. Large-Scale Multimedia Semantic Concept Modeling using Robust Subspace Bagging and MapReduce. First ACM workshop on Large-scale multimedia retrieval and mining @ACM Multimedia (LS-MMRM) 2009. PDF BibTeX
Michele Merler, John R Kender. Semantic keyword extraction via adaptive text binarization of unstructured unsourced video. IEEE International Conference on Image Processing (ICIP) 2009. PDF BibTeX Poster
Michele Merler, Carolina Galleguillos, Serge Belongie. Recognizing groceries in situ using in vitro training data. 2nd International Workshop on Semantic Learning Applications in Multimedia @CVPR (SLAM) 2007. PDF BibTeX Project
Raghu Kiran Ganti, Mudhakar Srivatsa, Shreeranjani Srirangamsridharan, Jae-Wook Ahn, Michele Merler, Dean Steuer. Transparent and controllable topic modeling. US Patent App. 17/748,263 2023 GooglePatents
Michele Merler, Aashka Trivedi, Rameswar Panda, Bishwaranjan Bhattacharjee, Taesun Moon, Avirup Sil. Neural architecture search of language models using knowledge distillation. US Patent App. 17/075,963 2022 GooglePatents
Ulrich Alfons Finkler, Michele Merler, Mayoore Selvarasa Jaiswal, Hui Wu, Rameswar Panda, Wei Zhang. Configuring a neural network using smoothing splines. US Patent App. 17/075,963 2022 GooglePatents
Michele Merler, Mauro Martino, Cicero Nogueira dos Santos, Alfio Massimiliano Gliozzo, John R. Smith. Automatic generation of content using multimedia. US Patent 11,170,270 2021 GooglePatents
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen C Hammer, John Joseph Kent, John R Smith, Rogerio Feris. Auto-curation and personalization of sports highlights. US Patent 10,595,101 2020 GooglePatents
Michele Merler, Jae-Eun Park, John R Smith, Rosario Uceda-Sosa. Individual and user group attributes discovery and comparison from social media visual content. US Patent 10,282,677 2019 GooglePatents
Michele Merler, John R Smith, Rosario Uceda-Sosa, Hui Wu. Image classification utilizing semantic relationships in a classification hierarchy. US Patent 9,928,448 2018 GooglePatents
Liangliang Cao, Michele Merler, John R Smith. Systems and methods for inferring gender by fusion of multimodal content. US Patent 9,684,852 2017 GooglePatents
Michele Merler, John R Kender. Kalman filter approach to augment object tracking. US Patent 9,177,229 2015 GooglePatents