publications

publications by category in reverse chronological order. generated by jekyll-scholar.

  1. SemDeDup: Data-efficient learning at web-scale through semantic deduplication
    Amro Abbas, Kushal Tirumala, Dániel Simig, and 2 more authors
    Best Paper Award @ The Multimodal Representation Learning Workshop at ICLR 2023
  2. DataComp-LM: In search of the next generation of training sets for language models
    Jeffrey Li*, Alex Fang*, Georgios Smyrnis*, and 8 more authors
    NeurIPS 2024
  3. Effective pruning of web-scale datasets based on complexity of concept clusters
    Amro Abbas*, Evgenia Rusak*, Kushal Tirumala, and 3 more authors
    ICLR 2024; Oral Presentation @ DataComp Workshop at ICCV 2023
  4. Progress and limitations of deep networks to recognize objects in unusual poses
    Amro Abbas and Stéphane Deny
    AAAI 2023
  5. SIEVE: Multimodal dataset pruning using image captioning models
    Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, and 4 more authors
    ICLR 2024
  6. A comparison between humans and AI at recognizing objects in unusual poses
    Netta Ollikka, Amro Abbas, Andrea Perin, and 2 more authors
    TMLR 2024