Michele Merler

Senior Research Scientist

IBM Research AI

mimerler at us dot ibm dot com

Ciao! I am a Senior Research Scientist at IBM Research AI. Since 2012 I have been at the T. J. Watson Research Lab in New York. Before that, I did my PhD at Columbia University with Professor John Kender. My interests are in Multimedia and applications of Deep Learning in Computer Vision and NLP, with a focus on neural architecture search for vision and language models, and sports video analysis. I have publications in several peer-reviewed journals and conferences, including IEEE TMM, CVPR, ACM Multimedia, AAAI, ICMR, and MICCAI. Among other professional activities, I have served as an Associate Editor for the IEEE Transactions on Multimedia (2021-2023), as Area Chair for ECCV in 2024 and for ACM Multimedia in 2016 and 2017, and as local organization chair and web chair for ICMR in 2016. My work has been recognized in the popular press (including the New York Times, Fortune, and NBC News) and I have been fortunate to win some awards, including a 2023 Tech Emmy Award. You can look at my CV here.


Interests

  • Multimedia
  • Deep (and shallow) Learning
  • Computer Vision
  • NLP Applications
  • AI for code

Education

  • PhD in Computer Science, 2013

    Columbia University

  • MS in Computer Science, 2008

    Columbia University

  • MEng in Telecommunications Engineering, 2007

    University of Trento, Italy

NEWS

- We won Best Paper Award at IEEE ICDH! paper July 2024
- The (Computer) Vision of Sports book chapter is out! June 2024
- Granite Code Models paper and models are released open source! May 2024
- Code Lingua leaderboard is out! April 2024
- 10th Workshop on Computer Vision in Sports (CVSports) @CVPR 2024
- I am serving as Area Chair for ECCV 2024
- Code Lingua paper on evaluating CodeLLMs for translation accepted at ICSE 2024
- We helped IBM win a Tech Emmy Award for AI-ML curation of Sports Highlights @EMMYS 2023!
- 9th Workshop on Computer Vision in Sports (CVSports) @CVPR 2023
- 4th Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2023
- 8th Workshop on Computer Vision in Sports (CVSports) @CVPR 2022
- Nominated Outstanding Reviewer @CVPR 2021
- I have been appointed Associate Editor for IEEE Transactions on Multimedia (TMM) (2021-2023)!
- 7th Workshop on Computer Vision in Sports (CVSports) @CVPR 2021
- 2nd Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2021
- NASTransfer paper accepted to AAAI (Feb 2021)
- 6th Workshop on Computer Vision in Sports (CVSports) @CVPR 2020
- Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020
- 2nd Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
- The Diversity in Faces dataset is out. Check it out here (Jan 2019)
- AI Self Portrait in the art gallery of the NIPS Workshop on Machine Learning for Creativity (Dec 2018)
- Our AI Self Portrait has been published in the New York Times! (Oct 2018)
- Cognitive Highlights work accepted to IEEE TMM (Sep 2018)
- Cognitive Highlights wins 2018 Best Digital Development at the Yahoo Sports Tech Awards for Wimbledon!
- Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
- New Face Attributes models coming to WDC Visual Recognition mitigating bias (Feb 2018)
- Cognitive Highlights @US Open! (Sep 2017)
- Presenting Cognitive Highlights (@Wimbledon July 2017) in the CV in Sports Workshop @CVPR 2017
- Food recognition model launched in beta in WDC Visual Recognition (May 2017)
- Cognitive Highlights @Golf Masters! (Apr 2017)
- I am Area Chair for ACM Multimedia 2017 in Multimedia Search and Recommendation track
- Presenting our delicious Food analytics papers @ACM Multimedia 2016
- I am Local Organization Chair and Webmaster for ICMR 2016
- I acted as a Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
- I am Area Chair for ACM Multimedia 2016 in Multimedia Search and Recommendation track
- Received the outstanding reviewer award from the 2015 International Conference on Multimedia Retrieval (ICMR)

Selected Publications

2024

  • Carla Agurto, Michele Merler, Esteban Roitberg, Alan Taitz, Marcos A. Trevisan, Diego E. Shalom, Julian Peller, Lyle W. Ostrow, Indu Navar, Ernest Fraenkel, James Berry, Guillermo A. Cecchi and Raquel Norel. Harnessing Remote Speech Tasks for Early ALS Biomarker Identification. IEEE International Conference on Digital Health (ICDH) 2024. PDF BibTeX

  • Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda. Granite Code Models: A Family of Open Foundation Models for Code Intelligence. arXiv (arXiv) 2024. arXiv BibTeX

  • Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha and Reyhaneh Jabbarvand. Lost in translation: A study of bugs introduced by large language models while translating code. International Conference on Software Engineering (ICSE) 2024. arXiv BibTeX

  • Rikke Gade, Michele Merler, Graham Thomas and Thomas B Moeslund. The (Computer) Vision of Sports: Recent Trends in Research and Commercial Systems for Sport Analytics. Computer Vision: Challenges, Trends, and Opportunities (CRC Press) 2024. book preview BibTeX

2023

  • Masayasu Muraoka, Bishwaranjan Bhattacharjee, Michele Merler, Graeme Blackwood, Yulong Li, and Yang Zhao. Cross-Lingual Transfer of Large Language Model by Visually-Derived Supervision Toward Low-Resource Languages. ACM Multimedia (MM) 2023. PDF BibTeX

  • Takuma Udagawa, Aashka Trivedi, Michele Merler and Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. EMNLP Industry Track (EMNLP) 2023. arXiv BibTeX

  • Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh and Maja Vukovic. CoSiNES: Contrastive Siamese Network for Entity Standardization. ACL Matching Workshop (ACLW) 2023. arXiv code BibTeX

2022

  • Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi and Bishwaranjan Bhattacharjee. Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models. arXiv (arXiv) 2022. arXiv BibTeX

2021

  • Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio S Feris, Bishwaranjan Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. 35th AAAI Conference on Artificial Intelligence (AAAI) 2021. arXiv BibTeX

2020

  • Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. arXiv (arXiv) 2020. arXiv BibTeX

  • Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M Gliozzo, John R Smith. Covering the News with (AI) Style. arXiv (arXiv) 2020. arXiv BibTeX

2019

  • Michele Merler, Nalini Ratha, Rogerio S Feris, John R Smith. Diversity in Faces. arXiv (arXiv) 2019. arXiv BibTeX Project

2018

  • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R Smith, Rogerio S Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on Multimedia (TMM) 2018. PDF BibTeX Project

2017

  • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. 3rd Workshop of Computer Vision in Sports @CVPR (CVPRW) 2017. PDF BibTeX Slides Project

  • Dhiraj Joshi, Michele Merler, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. IBM High-Five: Highlights From Intelligent Video Engine. ACM Multimedia (MM) 2017. PDF BibTeX Project

  • Xiaolong Wang, Guodong Guo, Michele Merler, Noel CF Codella, MV Rohith, John R Smith, Chandra Kambhamettu. Leveraging multiple cues for recognizing family photos. Image and Vision Computing (IVC) 2017. PDF BibTeX

2016

  • Michele Merler, Hui Wu, Rosario Uceda-Sosa, Quoc-Bao Nguyen, John R Smith. Snap, Eat, RepEat: a food recognition engine for dietary logging. 2nd International Workshop on Multimedia Assisted Dietary Management @ACM Multimedia (MADIMA) 2016. PDF BibTeX Project Slides Poster

  • Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R Smith. Learning to make better mistakes: Semantics-aware visual food recognition. ACM Multimedia (MM) 2016. PDF BibTeX Project

2015

  • Michele Merler, Liangliang Cao, John R Smith. You are what you tweet… pic! gender prediction based on semantic analysis of social media images. IEEE International Conference on Multimedia and Expo (ICME) 2015. PDF BibTeX Slides

  • Junjie Cai, Michele Merler, Sharath Pankanti, Qi Tian. Heterogeneous semantic level features fusion for action recognition. IEEE International Conference on Multimedia Retrieval (ICMR) 2015. PDF BibTeX

  • Mani Abedini, Noel CF Codella, Jonathan H Connell, Rahil Garnavi, Michele Merler, Sharath Pankanti, John R Smith, Tanveer Syeda-Mahmood. A generalized framework for medical image classification and recognition. IBM Journal of Research and Development (IBM-JRD) 2015. PDF BibTeX

2014

  • Felix X Yu, Liangliang Cao, Michele Merler, Noel Codella, Tao Chen, John R Smith, Shih-Fu Chang. Modeling attributes from category-attribute proportions. ACM Multimedia (MM) 2014. PDF BibTeX

  • Noel Codella, Jonathan Connell, Sharath Pankanti, Michele Merler, John R Smith. Automated medical image modality recognition by fusion of visual and text information. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014. PDF BibTeX CLEF13 Slides

PRE-2013

  • Michele Merler, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev. Semantic model vectors for complex video event recognition. IEEE Transactions on Multimedia (TMM) 2012. PDF BibTeX Project

  • Michele Merler, Rong Yan, John R Smith. Imbalanced rankboost for efficiently ranking large-scale image/video collections. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2009. PDF BibTeX Poster

  • Rong Yan, Marc-Olivier Fleury, Michele Merler, Apostol Natsev, John R. Smith. Large-Scale Multimedia Semantic Concept Modeling using Robust Subspace Bagging and MapReduce. First ACM workshop on Large-scale multimedia retrieval and mining @ACM Multimedia (LS-MMRM) 2009. PDF BibTeX

  • Michele Merler, John R Kender. Semantic keyword extraction via adaptive text binarization of unstructured unsourced video. IEEE International Conference on Image Processing (ICIP) 2009. PDF BibTeX Poster

  • Michele Merler, Carolina Galleguillos, Serge Belongie. Recognizing groceries in situ using in vitro training data. 2nd International Workshop on Semantic Learning Applications in Multimedia @CVPR (SLAM) 2007. PDF BibTeX Project

PATENTS

  • Jiaqing Yuan, Michele Merler, Mihir Choudhury, Venkata Nagaraju Pavuluri, Maja Vukovic. Entity standardization for application modernization. US Patent App. 18/160,301 2024 GooglePatents

  • Michele Merler, Paul Pritz. Attribute-based calibration for machine learning. US Patent App. 17/977,880 2024 GooglePatents

  • Anup Kalia, Mihir Choudhury, Jin Xiao, Divya Sankar, John Rofrano, Venkata Nagaraju Pavuluri, Lambert Pouguem Wassi, Maja Vukovic, Michele Merler. Adaptable and explainable application modernization disposition. US Patent App. 18/071,911 2024 GooglePatents

  • Michele Merler, Dhiraj Joshi, Apurv Gupta, Sebastien Gilbert, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. AI System and Method for Automatic Analog Gauge Reading. US Patent App. 17/936,519 2024 GooglePatents

  • Sebastien Gilbert, Michele Merler, Dhiraj Joshi, Apurv Gupta, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. Oblique Image Rectification. US Patent App. 18/048,975 2024 FreePatentsOnline

  • Dinesh C Verma, Franck Vinh Le, Michele Merler, Dhiraj Joshi, Supriyo Chakraborty, Seraphin Bernard Calo. Knowledge expansion for improving machine learning. US Patent App. 17/935,198 2024 GooglePatents

  • Raghu Kiran Ganti, Mudhakar Srivatsa, Shreeranjani Srirangamsridharan, Jae-Wook Ahn, Michele Merler, Dean Steuer. Transparent and controllable topic modeling. US Patent 11,941,038 2024 GooglePatents

  • Michele Merler, Aashka Trivedi, Rameswar Panda, Bishwaranjan Bhattacharjee, Taesun Moon, Avirup Sil. Neural architecture search of language models using knowledge distillation. US Patent App. 17/075,963 2022 GooglePatents

  • Ulrich Alfons Finkler, Michele Merler, Mayoore Selvarasa Jaiswal, Hui Wu, Rameswar Panda, Wei Zhang. Configuring a neural network using smoothing splines. US Patent App. 17/075,963 2022 GooglePatents

  • Michele Merler, Mauro Martino, Cicero Nogueira dos Santos, Alfio Massimiliano Gliozzo, John R. Smith. Automatic generation of content using multimedia. US Patent 11,170,270 2021 GooglePatents

  • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen C Hammer, John Joseph Kent, John R Smith, Rogerio Feris. Auto-curation and personalization of sports highlights. US Patent 10,595,101 2020 GooglePatents

  • Michele Merler, Jae-Eun Park, John R Smith, Rosario Uceda-Sosa. Individual and user group attributes discovery and comparison from social media visual content. US Patent 10,282,677 2019 GooglePatents

  • Michele Merler, John R Smith, Rosario Uceda-Sosa, Hui Wu. Image classification utilizing semantic relationships in a classification hierarchy. US Patent 9,928,448 2018 GooglePatents

  • Liangliang Cao, Michele Merler, John R Smith. Systems and methods for inferring gender by fusion of multimodal content. US Patent 9,684,852 2017 GooglePatents

  • Michele Merler, John R Kender. Kalman filter approach to augment object tracking. US Patent 9,177,229 2015 GooglePatents

Professional Activities

  • Area Chair at ECCV 2024
  • Associate Editor for the IEEE Transactions on Multimedia (TMM) 2021-2023
  • Organizer for the Workshop on Computer Vision in Sports (CVSports) @CVPR 2020-2023
  • Organizer for the Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020-2023
  • Program Chair for the Second International Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
  • Program Chair for the International Conference on Image Analysis and Processing (ICIAP) 2019
  • Demo Chair for the International Conference on MultiMedia Modeling (MMM) 2019
  • Program Chair for First International Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
  • Area Chair at ACM Multimedia 2016, 2017. Multimedia Search and Recommendation track
  • Registration Chair at VCIP 2017
  • Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
  • Local Arrangement Chair and Web Chair at ICMR 2016
  • Panelist at National Science Foundation (NSF), IIS Division 2015
  • Co-organizer of the Greater New York Multimedia and Vision Meeting (GNYMV) 2012-2014
  • TPC for CVPR, ECCV, ICCV, ACM Multimedia, NeurIPS, AAAI, ICMR, FG, ICME, ICPR
  • Reviewer for IEEE TMM, IEEE TPAMI, IEEE TIP, CVIU, TOMCCAP, JVCI, TCSVT, JVCIR
  • Member of IEEE, IEEE Circuits and Systems Society, New York Academy of Sciences

Honors and Awards

  • Best Paper Award for "Harnessing Remote Speech Tasks for Early ALS Biomarker Identification" (IEEE ICDH 2024)
  • Technology & Engineering Emmy Award (AI-ML curation of Sports Highlights) (2023)
  • IBM Outstanding Technical Achievement (Konveyor Open-Source Community) (2023)
  • Outstanding Reviewer (CVPR 2021)
  • IBM Corporate Award (AI Video Enrichment and Editing) (2019)
  • IBM Research Division Award (Watson Visual Recognition Services Contributions) (2018)
  • Best Digital Development from Yahoo Sports Tech Awards for Wimbledon Cognitive Highlights (2018)
  • Best Reviewer Award (ICMR 2015)
  • First Place in ImageCLEF Medical Image Modality Classification (2013)
  • IBM Research Division Award (Multimedia Semantic Modeling) (2013)
  • IBM Outstanding Technical Accomplishment (Multimedia Group) (2012)
  • IBM Eminence and Excellence Award (Greater New York Multimedia and Vision Workshop) (2012)
  • First Place in ImageCLEF Medical Image Modality Classification (2012)
  • Yahoo! Key Scientific Challenge Award (2009)
  • VideOlympics "People's Choice Award" (group) for IMARS Multimedia Retrieval System, ACM CIVR (2008)
  • California Institute of Information and Telecommunications Technology Summer Undergraduate Research Scholarship (2006)

Personal Interests

I love calcio (or football, or soccer), both to watch and to play with my friends! I've made so many great memories playing in and around NYC...