Michele Merler

Senior Research Scientist

IBM Research AI

mimerler at us dot ibm dot com

Ciao! I am a Senior Research Scientist at IBM Research AI. Since 2012 I have been at the T. J. Watson Research Lab in New York. Before that, I did my PhD at Columbia University with Professor John Kender. My interests are in Multimedia and applications of Deep Learning in Computer Vision and NLP, with a focus on neural architecture search for vision and language models, and sports video analysis. I have publications in several peer-reviewed journals and conferences, including IEEE TMM, CVPR, ACM Multimedia, AAAI, ICMR, MICCAI, etc. Among other professional activities, I have served as an Associate Editor for the IEEE Transactions on Multimedia (2021-2023), as Area Chair for ECCV in 2024 and for ACM Multimedia in 2016 and 2017, and local organization chair and web chair for ICMR in 2016. My work has been recognized in the popular press (including New York Times, Fortune, NBC News) and I have been fortunate to win some awards, including the 2023 Tech Emmys. You can look at my CV here.

Interests

Multimedia
Deep (and shallow) Learning
Computer Vision
NLP Applications
AI for code

Education

PhD in Computer Science, 2013
Columbia University

MS in Computer Science, 2008
Columbia University

MEng in Telecommunications Engineering, 2007
University of Trento, Italy

NEWS

- We won the Best Paper Award at IEEE ICDH! (paper, July 2024)
- The (Computer) Vision of Sports book chapter is out! (June 2024)
- The Granite Code Models paper and models have been released as open source! (May 2024)
- The Code Lingua leaderboard is out! (April 2024)
- 10th Workshop on Computer Vision in Sports (CVSports) @CVPR 2024
- I am serving as Area Chair for ECCV 2024
- Code Lingua paper on evaluating CodeLLMs for translation accepted at ICSE 2024
- We helped IBM win a Tech Emmy Award for AI-ML curation of Sports Highlights @EMMYS 2023!
- 9th Workshop on Computer Vision in Sports (CVSports) @CVPR 2023
- 4th Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2023
- 8th Workshop on Computer Vision in Sports (CVSports) @CVPR 2022
- Nominated Outstanding Reviewer @CVPR 2021
- I have been appointed Associate Editor for IEEE Transactions on Multimedia (TMM) (2021-2023)!
- 7th Workshop on Computer Vision in Sports (CVSports) @CVPR 2021
- 2nd Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2021
- NASTransfer paper accepted to AAAI (Feb 2021)
- 6th Workshop on Computer Vision in Sports (CVSports) @CVPR 2020
- Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020
- 2nd Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
- The Diversity in Faces dataset is out. Check it out here (Jan 2019)
- AI Self Portrait in the art gallery of the NIPS Workshop on Machine Learning for Creativity (Dec 2018)
- Our AI Self Portrait has been published in the New York Times! (Oct 2018)
- Cognitive Highlights work accepted to IEEE TMM (Sep 2018)
- Cognitive Highlights wins 2018 Best Digital Development at the Yahoo Sports Tech Awards for Wimbledon!
- Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
- New Face Attributes models coming to WDC Visual Recognition mitigating bias (Feb 2018)
- Cognitive Highlights @US Open! (Sep 2017)
- Presenting Cognitive Highlights (@Wimbledon July 2017) in the CV in Sports Workshop @CVPR 2017
- Food recognition model launched in beta in WDC Visual Recognition (May 2017)
- Cognitive Highlights @Golf Masters! (Apr 2017)
- I am Area Chair for ACM Multimedia 2017 in Multimedia Search and Recommendation track
- Presenting our delicious Food analytics papers @ACM Multimedia 2016
- I am Local Organization Chair and Webmaster for ICMR 2016
- I acted as a Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
- I am Area Chair for ACM Multimedia 2016 in Multimedia Search and Recommendation track
- Received the outstanding reviewer award from the 2015 International Conference on Multimedia Retrieval (ICMR)

Research Projects

CodeLLM for Generation and Translation

with Rangeet Pan, Rahul Krishna, Boris Sobolev, Divya Sankar, Lambert Pouguem Wassi, Saurabh Sinha, Raju Pavuluri, Ali Reza Ibrahimzada, Mayank Mishra, Rameswar Panda

2023-2024

CodeLingua Leaderboard Granite Code Models huggingface Granite Code Models github

Contrastive Learning for Entity Standardization

with Jiaqing Yuan, Mihir Choudhury, Divya Sankar, Lambert P. Wassi, John Rofrano, Raju Pavuluri, Maja Vukovic

2022-2023

Details Konveyor TCA Project CoSiNES Paper

Knowledge Distillation for Language Models

with Aashka Trivedi, Takuma Udagawa, Rameswar Panda, Yousef El-Kurdi, Bishwaranjan Bhattacharjee

2022-2023

Details KD-NAS Paper EMNLP Paper

Large Scale Neural Architecture Search

with Rameswar Panda, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio S Feris, Bishwaranjan Bhattacharjee

2020-2021

Details NASTransfer Paper NAS Spline Paper

Diversity in Faces

with Nalini Ratha, Rogerio S Feris, John R Smith

2019

Details Blog Post Techcrunch CNBC Venturebeat CNET

Cognitive Highlights

with Dhiraj Joshi, Quoc-Bao Nguyen, Khoi-Nguyen C. Mac, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris

2017-2018 @Masters, Wimbledon and US Open

Details Video 2018 Golf Masters Video 2017 US Open Demo Video Golf Masters Demo Video Wimbledon Demo Video US Open New York Times Fortune NBC News ZDNET CNET Blog Post 1 Blog Post 2

Food Recognition

with Hui Wu, Rosario Uceda-Sosa, Quoc-Bao Nguyen, John R Smith

2016

Details Demo Research Demo Video 1 Research Demo Video 2 Blog Post

Semantic Model Vectors for Extracting Insights from Consumer Images and Videos

with Liangliang Cao, John R Smith, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev

from 2012

Details Concept Video SMV for Video Event Recognition Project

Fun Projects

AI Self Portrait

with Cicero Nogueira dos Santos, Mauro Martino, Alfio M Gliozzo, John R Smith

2018

Details New York Times page Art Gallery @NeurIPS 2018 ArXiv paper Blog post CNBC ZDNet Phys.org Axios

Selected Publications

2024

Carla Agurto, Michele Merler, Esteban Roitberg, Alan Taitz, Marcos A. Trevisan, Diego E. Shalom, Julian Peller, Lyle W. Ostrow, Indu Navar, Ernest Fraenkel, James Berry, Guillermo A. Cecchi and Raquel Norel. Harnessing Remote Speech Tasks for Early ALS Biomarker Identification. IEEE International Conference on Digital Health (ICDH) 2024. PDF BibTeX
@inproceedings{agurto2024harnessing, title={Harnessing Remote Speech Tasks for Early ALS Biomarker Identification}, author={Carla Agurto and Michele Merler and Esteban Roitberg and Alan Taitz and Marcos A. Trevisan and Diego E. Shalom and Julian Peller and Lyle W. Ostrow and Indu Navar and Ernest Fraenkel and James Berry and Guillermo A. Cecchi and Raquel Norel}, booktitle={ICDH}, year={2024} }
Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Yi Zhou, Chris Johnson, Aanchal Goyal, Hima Patel, Yousaf Shah, Petros Zerfos, Heiko Ludwig, Asim Munawar, Maxwell Crouse, Pavan Kapanipathi, Shweta Salaria, Bob Calio, Sophia Wen, Seetharami Seelam, Brian Belgodere, Carlos Fonseca, Amith Singhee, Nirmit Desai, David D. Cox, Ruchir Puri, Rameswar Panda. Granite Code Models: A Family of Open Foundation Models for Code Intelligence. arXiv (arXiv) 2024. arXiv BibTeX
@misc{mishra2024granite, title={Granite Code Models: A Family of Open Foundation Models for Code Intelligence}, author={Mayank Mishra and Matt Stallone and Gaoyuan Zhang and Yikang Shen and Aditya Prasad and Adriana Meza Soria and Michele Merler and Parameswaran Selvam and Saptha Surendran and Shivdeep Singh and Manish Sethi and Xuan-Hong Dang and Pengyuan Li and Kun-Lung Wu and Syed Zawad and Andrew Coleman and Matthew White and Mark Lewis and Raju Pavuluri and Yan Koyfman and Boris Lublinsky and Maximilien de Bayser and Ibrahim Abdelaziz and Kinjal Basu and Mayank Agarwal and Yi Zhou and Chris Johnson and Aanchal Goyal and Hima Patel and Yousaf Shah and Petros Zerfos and Heiko Ludwig and Asim Munawar and Maxwell Crouse and Pavan Kapanipathi and Shweta Salaria and Bob Calio and Sophia Wen and Seetharami Seelam and Brian Belgodere and Carlos Fonseca and Amith Singhee and Nirmit Desai and David D. Cox and Ruchir Puri and Rameswar Panda}, eprint={2405.04324}, archivePrefix={arXiv}, primaryClass={cs.AI}, year={2024} }
Rangeet Pan, Ali Reza Ibrahimzada, Rahul Krishna, Divya Sankar, Lambert Pouguem Wassi, Michele Merler, Boris Sobolev, Raju Pavuluri, Saurabh Sinha and Reyhaneh Jabbarvand. Lost in translation: A study of bugs introduced by large language models while translating code. International Conference on Software Engineering (ICSE) 2024. arXiv BibTeX
@inproceedings{pan2024lost, title={Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code}, author={Rangeet Pan and Ali Reza Ibrahimzada and Rahul Krishna and Divya Sankar and Lambert Pouguem Wassi and Michele Merler and Boris Sobolev and Raju Pavuluri and Saurabh Sinha and Reyhaneh Jabbarvand}, booktitle={ICSE}, year={2024} }
Rikke Gade, Michele Merler, Graham Thomas and Thomas B Moeslund. The (Computer) Vision of Sports: Recent Trends in Research and Commercial Systems for Sport Analytics. Computer Vision: Challenges, Trends, and Opportunities (CRC Press) 2024. book preview BibTeX
@inbook{gade2024sports, title={The (Computer) Vision of Sports: Recent Trends in Research and Commercial Systems for Sport Analytics}, author={Rikke Gade and Michele Merler and Graham Thomas and Thomas B Moeslund}, chapter={Computer Vision: Challenges, Trends, and Opportunities}, publisher={CRC Press}, year={2024} }

2023

Masayasu Muraoka, Bishwaranjan Bhattacharjee, Michele Merler, Graeme Blackwood, Yulong Li, and Yang Zhao. Cross-Lingual Transfer of Large Language Model by Visually-Derived Supervision Toward Low-Resource Languages. ACM Multimedia (MM) 2023. PDF BibTeX
@inproceedings{Crosslingual_MM2023, author = {Muraoka, Masayasu and Bhattacharjee, Bishwaranjan and Merler, Michele and Blackwood, Graeme and Li, Yulong and Zhao, Yang}, title = {Cross-Lingual Transfer of Large Language Model by Visually-Derived Supervision Toward Low-Resource Languages}, booktitle = {Proceedings of the 31st ACM International Conference on Multimedia}, year = {2023}, }
Takuma Udagawa, Aashka Trivedi, Michele Merler and Bishwaranjan Bhattacharjee. A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models. EMNLP Industry Track (EMNLP) 2023. arXiv BibTeX
@inproceedings{udagawa-etal-2023-comparative, author = {Udagawa, Takuma and Trivedi, Aashka and Merler, Michele and Bhattacharjee, Bishwaranjan}, title = {A Comparative Analysis of Task-Agnostic Distillation Methods for Compressing Transformer Language Models}, booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track}, year = {2023} }
Jiaqing Yuan, Michele Merler, Mihir Choudhury, Raju Pavuluri, Munindar P. Singh and Maja Vukovic. CoSiNES: Contrastive Siamese Network for Entity Standardization. ACL Matching Workshop (ACLW) 2023. arXiv code BibTeX
@inproceedings{cosines_acl2023, author = {Jiaqing Yuan and Michele Merler and Mihir Choudhury and Raju Pavuluri and Munindar P. Singh and Maja Vukovic}, title = {CoSiNES: Contrastive Siamese Network for Entity Standardization}, booktitle = {Matching Workshop at ACL}, year = {2023} }

2022

Aashka Trivedi, Takuma Udagawa, Michele Merler, Rameswar Panda, Yousef El-Kurdi and Bishwaranjan Bhattacharjee. Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models. arXiv (arXiv) 2022. arXiv BibTeX
@inproceedings{KDNAS_ARXIV2022, author = {Aashka Trivedi and Takuma Udagawa and Michele Merler and Rameswar Panda and Yousef El-Kurdi and Bishwaranjan Bhattacharjee}, title = {Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models}, booktitle = {arXiv}, year = {2022} }

2021

Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio S Feris, Bishwaranjan Bhattacharjee. NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search. 35th AAAI Conference on Artificial Intelligence (AAAI) 2021. arXiv BibTeX
@inproceedings{NASTransfer_AAAI2021, author = {Rameswar Panda and Michele Merler and Mayoore Jaiswal and Hui Wu and Kandan Ramakrishnan and Ulrich Finkler and Chun-Fu Chen and Minsik Cho and David Kung and Rog{\'{e}}rio S Feris and Bishwaranjan Bhattacharjee}, title = {NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search}, booktitle = {AAAI}, year = {2021} }

2020

Ulrich Finkler, Michele Merler, Rameswar Panda, Mayoore S Jaiswal, Hui Wu, Kandan Ramakrishnan, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee. Large Scale Neural Architecture Search with Polyharmonic Splines. arXiv (arXiv) 2020. arXiv BibTeX
@inproceedings{NASsplines_ARXIV2020, author = {Ulrich Finkler and Michele Merler and Rameswar Panda and Mayoore S Jaiswal and Hui Wu and Kandan Ramakrishnan and Chun-Fu Chen and Minsik Cho and David Kung and Rogerio Feris and Bishwaranjan Bhattacharjee}, title = {Large Scale Neural Architecture Search with Polyharmonic Splines}, booktitle = {arXiv}, year = {2020} }
Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M Gliozzo, John R Smith. Covering the News with (AI) Style. arXiv (arXiv) 2020. arXiv BibTeX
@inproceedings{AIportrait_arxiv20, author = {Michele Merler and Cicero Nogueira dos Santos and Mauro Martino and Alfio M Gliozzo and John R Smith}, title = {Covering the News with (AI) Style}, booktitle = {arXiv}, year = {2020} }

2019

Michele Merler, Nalini Ratha, Rogerio S Feris, John R Smith. Diversity in Faces. arXiv (arXiv) 2019. arXiv BibTeX Project
@inproceedings{dif_arxiv19, author = {Michele Merler and Nalini Ratha and Rog{\'{e}}rio S Feris and John R Smith}, title = {Diversity in Faces}, booktitle = {arXiv}, year = {2019} }

2018

Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, Jinjun Xiong, Minh N. Do, John R Smith, Rogerio S Feris. Automatic Curation of Sports Highlights using Multimodal Excitement Features. IEEE Transactions on MultiMedia (TMM) 2018. PDF BibTeX Project
@inproceedings{Merler_TMM18, author = {Michele Merler and Dhiraj Joshi and Quoc-Bao Nguyen and Stephen Hammer and John Kent and Jinjun Xiong and Minh N. Do and John R Smith and Rog{\'{e}}rio S Feris}, title = {Automatic Curation of Sports Highlights using Multimodal Excitement Features}, booktitle = {{IEEE} Transactions on MultiMedia}, year = {2018} }

2017

Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. Automatic Curation of Golf Highlights using Multimodal Excitement Features. 3rd Workshop of Computer Vision in Sports @CVPR (CVPRW) 2017. PDF BibTeX Slides Project
@inproceedings{Merler_CVPRW17, author = {Michele Merler and Dhiraj Joshi and Quoc-Bao Nguyen and Stephen Hammer and John Kent and John R Smith and Rog{\'{e}}rio S Feris}, title = {Automatic Curation of Golf Highlights Using Multimodal Excitement Features}, booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition Workshops, {CVPR} Workshops}, year = {2017} }
Dhiraj Joshi, Michele Merler, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R Smith, Rogerio S Feris. IBM High-Five: Highlights From Intelligent Video Engine. ACM Multimedia (MM) 2017. PDF BibTeX Project
@inproceedings{Joshi_MM17, author = {Joshi, Dhiraj and Merler, Michele and Nguyen, Quoc-Bao and Hammer, Stephen and Kent, John and Smith, John R. and Feris, Rogerio S.}, title = {IBM High-Five: Highlights From Intelligent Video Engine}, booktitle = {Proceedings of the ACM on Multimedia Conference}, year = {2017} }
Xiaolong Wang, Guodong Guo, Michele Merler, Noel CF Codella, MV Rohith, John R Smith, Chandra Kambhamettu. Leveraging multiple cues for recognizing family photos. Image and Vision Computing (IVC) 2017. PDF BibTeX
@article{Wang_IVC17, author = {Wang, Xiaolong and Guo, Guodong and Merler, Michele and C. F. Codella, Noel and MV, Rohith and Smith, John R. and Kambhamettu, Chandra}, title = {Leveraging Multiple Cues for Recognizing Family Photos}, journal = {Image Vision Computing}, issue_date = {February 2017}, volume = {58}, number = {C}, year = {2017}, pages = {61--75} }

2016

Michele Merler, Hui Wu, Rosario Uceda-Sosa, Quoc-Bao Nguyen, John R Smith. Snap, Eat, RepEat: a food recognition engine for dietary logging. 2nd International Workshop on Multimedia Assisted Dietary Management @ACM Multimedia (MADIMA) 2016. PDF BibTeX Project Slides Poster
@inproceedings{MerlerHUNS16, author = {Michele Merler and Hui Wu and Rosario Uceda{-}Sosa and Quoc{-}Bao Nguyen and John R Smith}, title = {Snap, eat, repEat: a Food Recognition Engine for Dietary Logging}, booktitle = {2nd International Workshop on Multimedia Assisted Dietary Management (MADIMA), in conjunction with International Conference on Multimedia, {MM} 2016}, year = {2016} }
Hui Wu, Michele Merler, Rosario Uceda-Sosa, John R Smith. Learning to make better mistakes: Semantics-aware visual food recognition. ACM Multimedia (MM) 2016. PDF BibTeX Project
@inproceedings{Wu_MM16, author = {Hui Wu and Michele Merler and Rosario Uceda{-}Sosa and John R Smith}, title = {Learning to Make Better Mistakes: Semantics-aware Visual Food Recognition}, booktitle = {2016 {ACM} International Conference on Multimedia, {MM} 2016}, year = {2016} }

2015

Michele Merler, Liangliang Cao, John R Smith. You are what you tweet… pic! gender prediction based on semantic analysis of social media images. IEEE International Conference on Multimedia and Expo (ICME) 2015. PDF BibTeX Slides
@inproceedings{Merler_ICME15, author = {Michele Merler and Liangliang Cao and John R Smith}, title = {You are what you tweet...pic! Gender prediction based on semantic analysis of social media images}, booktitle = {2015 IEEE International Conference on Multimedia and Expo (ICME)}, pages = {1--6}, year = {2015} }
Junjie Cai, Michele Merler, Sharath Pankanti, Qi Tian. Heterogeneous semantic level features fusion for action recognition. ACM International Conference on Multimedia Retrieval (ICMR) 2015. PDF BibTeX
@inproceedings{Cai_ICMR15, author = {Cai, Junjie and Merler, Michele and Pankanti, Sharath and Tian, Qi}, title = {Heterogeneous Semantic Level Features Fusion for Action Recognition}, booktitle = {Proceedings of the 5th ACM on International Conference on Multimedia Retrieval}, series = {ICMR '15}, pages = {307--314}, year = {2015} }
Mani Abedini, Noel CF Codella, Jonathan H Connell, Rahil Garnavi, Michele Merler, Sharath Pankanti, John R Smith, Tanveer Syeda-Mahmood. A generalized framework for medical image classification and recognition. IBM Journal of Research and Development (IBM-JRD) 2015. PDF BibTeX
@article{Abedini_JRD15, author={M. Abedini and N. C. F. Codella and J. H. Connell and R. Garnavi and M. Merler and S. Pankanti and J. R. Smith and T. Syeda-Mahmood}, journal={IBM Journal of Research and Development}, title={A generalized framework for medical image classification and recognition}, year={2015}, volume={59}, number={2/3}, pages={1:1-1:18} }

2014

Felix X Yu, Liangliang Cao, Michele Merler, Noel Codella, Tao Chen, John R Smith, Shih-Fu Chang. Modeling attributes from category-attribute proportions. ACM Multimedia (MM) 2014. PDF BibTeX
@inproceedings{Yu_MM14, author = {Yu, Felix X. and Cao, Liangliang and Merler, Michele and Codella, Noel and Chen, Tao and Smith, John R. and Chang, Shih-Fu}, title = {Modeling Attributes from Category-Attribute Proportions}, booktitle = {2014 {ACM} International Conference on Multimedia, {MM} 2014}, pages = {977--980}, year = {2014} }
Noel Codella, Jonathan Connell, Sharath Pankanti, Michele Merler, John R Smith. Automated medical image modality recognition by fusion of visual and text information. International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2014. PDF BibTeX CLEF13 Slides
@inproceedings{Codella_MICCAI14, author = {Noel Codella and Jonathan Connell and Sharath Pankanti and Michele Merler and John R Smith}, title = {Automated Medical Image Modality Recognition by Fusion of Visual and Text Information}, booktitle = {Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2014: 17th International Conference, Boston, MA, USA, September 14-18, 2014, Proceedings, Part II}, pages = {487--495}, year = {2014} }

PRE-2013

Michele Merler, Bert Huang, Lexing Xie, Gang Hua, Apostol Natsev. Semantic model vectors for complex video event recognition. IEEE Transactions on Multimedia (TMM) 2012. PDF BibTeX Project
@article{Merler_TMM12, author = {Michele Merler and Bert Huang and Lexing Xie and Gang Hua and Apostol Natsev}, title = {Semantic Model Vectors for Complex Video Event Recognition}, journal = {IEEE Transactions on Multimedia}, volume = {14}, number = {1}, pages = {88-101}, year = {2012} }
Michele Merler, Rong Yan, John R Smith. Imbalanced rankboost for efficiently ranking large-scale image/video collections. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) 2009. PDF BibTeX Poster
@inproceedings{Merler_CVPR09, author = {Michele Merler and Rong Yan and John R Smith}, title = {Imbalanced rankboost for efficiently ranking large-scale image/video collections}, booktitle = {IEEE International Conference on Computer Vision and Pattern Recognition (CVPR)}, year = {2009} }
Rong Yan, Marc-Olivier Fleury, Michele Merler, Apostol Natsev, John R. Smith. Large-Scale Multimedia Semantic Concept Modeling using Robust Subspace Bagging and MapReduce. First ACM workshop on Large-scale multimedia retrieval and mining @ACM Multimedia (LS-MMRM) 2009. PDF BibTeX
@inproceedings{Yan_MM09, author = {Rong Yan and Marc-Olivier Fleury and Michele Merler and Apostol Natsev and John R. Smith}, title = {Large-scale multimedia semantic concept modeling using robust subspace bagging and MapReduce}, booktitle = {Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining}, series = {LS-MMRM '09}, year = {2009} }
Michele Merler, John R Kender. Semantic keyword extraction via adaptive text binarization of unstructured unsourced video. IEEE International Conference on Image Processing (ICIP) 2009. PDF BibTeX Poster
@inproceedings{Merler_ICIP09, author = {Michele Merler and John R Kender}, title = {Semantic keyword extraction via adaptive text binarization of unstructured unsourced video}, booktitle = {IEEE International Conference on Image Processing (ICIP)}, year = {2009} }
Michele Merler, Carolina Galleguillos, Serge Belongie. Recognizing groceries in situ using in vitro training data. 2nd International Workshop on Semantic Learning Applications in Multimedia @CVPR (SLAM) 2007. PDF BibTeX Project
@inproceedings{Merler_CVPRW07, author = {Michele Merler and Carolina Galleguillos and Serge Belongie}, title = {Recognizing groceries in situ using in vitro training data}, booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition Workshops, {CVPR} Workshops}, year = {2007} }

PATENTS

Jiaqing Yuan, Michele Merler, Mihir Choudhury, Venkata Nagaraju Pavuluri, Maja Vukovic. Entity standardization for application modernization. US Patent App. 18/160,301 2024 GooglePatents
Michele Merler, Paul Pritz. Attribute-based calibration for machine learning. US Patent App. 17/977,880 2024 GooglePatents
Anup Kalia, Mihir Choudhury, Jin Xiao, Divya Sankar, John Rofrano, Venkata Nagaraju Pavuluri, Lambert Pouguem Wassi, Maja Vukovic, Michele Merler. Adaptable and explainable application modernization disposition. US Patent App. 18/071,911 2024 GooglePatents
Michele Merler, Dhiraj Joshi, Apurv Gupta, Sebastien Gilbert, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. AI System and Method for Automatic Analog Gauge Reading. US Patent App. 17/936,519 2024 GooglePatents
Sebastien Gilbert, Michele Merler, Dhiraj Joshi, Apurv Gupta, Shyama Prosad Chowdhury, Chidansh Amitkumar Bhatt, Nirmit V Desai. Oblique Image Rectification. US Patent App. 18/048,975 2024 FreePatentsOnline
Dinesh C Verma, Franck Vinh Le, Michele Merler, Dhiraj Joshi, Supriyo Chakraborty, Seraphin Bernard Calo. Knowledge expansion for improving machine learning. US Patent App. 17/935,198 2024 GooglePatents
Raghu Kiran Ganti, Mudhakar Srivatsa, Shreeranjani Srirangamsridharan, Jae-Wook Ahn, Michele Merler, Dean Steuer. Transparent and controllable topic modeling. US Patent 11,941,038 2024 GooglePatents
Michele Merler, Aashka Trivedi, Rameswar Panda, Bishwaranjan Bhattacharjee, Taesun Moon, Avirup Sil. Neural architecture search of language models using knowledge distillation. US Patent App. 17/075,963 2022 GooglePatents
Ulrich Alfons Finkler, Michele Merler, Mayoore Selvarasa Jaiswal, Hui Wu, Rameswar Panda, Wei Zhang. Configuring a neural network using smoothing splines. US Patent App. 17/075,963 2022 GooglePatents
Michele Merler, Mauro Martino, Cicero Nogueira dos Santos, Alfio Massimiliano Gliozzo, John R. Smith. Automatic generation of content using multimedia. US Patent 11,170,270 2021 GooglePatents
Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen C Hammer, John Joseph Kent, John R Smith, Rogerio Feris. Auto-curation and personalization of sports highlights. US Patent 10,595,101 2020 GooglePatents
Michele Merler, Jae-Eun Park, John R Smith, Rosario Uceda-Sosa. Individual and user group attributes discovery and comparison from social media visual content. US Patent 10,282,677 2019 GooglePatents
Michele Merler, John R Smith, Rosario Uceda-Sosa, Hui Wu. Image classification utilizing semantic relationships in a classification hierarchy. US Patent 9,928,448 2018 GooglePatents
Liangliang Cao, Michele Merler, John R Smith. Systems and methods for inferring gender by fusion of multimodal content. US Patent 9,684,852 2017 GooglePatents
Michele Merler, John R Kender. Kalman filter approach to augment object tracking. US Patent 9,177,229 2015 GooglePatents

Professional Activities

Area Chair at ECCV 2024
Associate Editor for the IEEE Transactions on Multimedia (TMM) 2021-2023
Organizer for the Workshop on Computer Vision in Sports (CVSports) @CVPR 2020-2023
Organizer for the Workshop on Fair, Data Efficient and Trusted Computer Vision (FA.DE.TR.CV) @CVPR 2020-2023
Program Chair for the Second International Workshop on Bias Estimation in Face Analytics (BEFA) @CVPR 2019
Program Chair for the International Conference on Image Analysis and Processing (ICIAP) 2019
Demo Chair for the International Conference on MultiMedia Modeling (MMM) 2019
Program Chair for First International Workshop on Bias Estimation in Face Analytics (BEFA) @ECCV 2018
Area Chair at ACM Multimedia 2016, 2017. Multimedia Search and Recommendation track
Registration Chair at VCIP 2017
Guest Editor for the Neurocomputing Special Issue on Advanced Learning for Large-Scale Heterogeneous Computing 2016
Local Arrangement Chair and Web Chair at ICMR 2016
Panelist at National Science Foundation (NSF), IIS Division 2015
Co-organizer of the Greater New York Multimedia and Vision Meeting (GNYMV) 2012-2014
TPC for CVPR, ECCV, ICCV, ACM Multimedia, NeurIPS, AAAI, ICMR, FG, ICME, ICPR
Reviewer for IEEE TMM, IEEE TPAMI, IEEE TIP, CVIU, TOMCCAP, JVCI, TCSVT, JVCIR
Member of IEEE, IEEE Circuits and Systems Society, New York Academy of Sciences

Honors and Awards

Best Paper Award for "Harnessing Remote Speech Tasks for Early ALS Biomarker Identification" (IEEE ICDH 2024)
Technology & Engineering Emmy Award (AI-ML curation of Sports Highlights) (2023)
IBM Outstanding Technical Achievement (Konveyor Open-Source Community) (2023)
Outstanding Reviewer (CVPR 2021)
IBM Corporate Award (AI Video Enrichment and Editing) (2019)
IBM Research Division Award (Watson Visual Recognition Services Contributions) (2018)
Best Digital Development from Yahoo Sports Tech Awards for Wimbledon Cognitive Highlights (2018)
Best Reviewer Award (ICMR 2015)
First Place in ImageCLEF Medical Image Modality Classification (2013)
IBM Research Division Award (Multimedia Semantic Modeling) (2013)
IBM Outstanding Technical Accomplishment (Multimedia Group) (2012)
IBM Eminence and Excellence Award (Greater New York Multimedia and Vision Workshop) (2012)
First Place in ImageCLEF Medical Image Modality Classification (2012)
Yahoo! Key Scientific Challenge Award (2009)
VideOlympics "People's Choice Award" (group) for IMARS Multimedia Retrieval System, ACM CIVR (2008)
California Institute of Information and Telecommunications Technology Summer Undergraduate Research Scholarship (2006)