Amro Abbas

DatologyAI | 📍 Bay Area


📊 It's always a multi-step, data-driven, literature-cited investigation.

🖥 If it can be parallelized on a GPU cluster, that’s art.

I am a founding member of the technical staff at DatologyAI, where I conduct research aimed at understanding how deep models learn from data. Prior to that, I was an AI Resident at Meta AI, working with Ari Morcos. I hold a master's in Math and Computer Science from the African Institute for Mathematical Sciences (AIMS), as part of the African Master's in Machine Intelligence (AMMI).

Do not hesitate to reach out for collaboration.

LinkedIn Twitter Google Scholar GitHub

news

Mar 01, 2024 Joined DatologyAI as a founding member of technical staff.
Jan 01, 2023 Our paper (SemDeDup: Data-efficient learning at web-scale through semantic deduplication) received the Best Paper Award @ The Multimodal Representation Learning Workshop at ICLR 2023.
Nov 01, 2022 New paper accepted at AAAI 2023 (Progress and limitations of deep networks to recognize objects in unusual poses).
Sep 01, 2022 I joined Meta AI in California as an AI Resident. I will be working with Ari Morcos’s team.
Oct 01, 2020 I completed my BSc in Electrical Engineering (major in software engineering) at the University of Khartoum.

selected publications

  1. SemDeDup: Data-efficient learning at web-scale through semantic deduplication
    Amro Abbas, Kushal Tirumala, Dániel Simig, and 2 more authors
    Best Paper Award @ The Multimodal Representation Learning Workshop at ICLR 2023
  2. DataComp-LM: In search of the next generation of training sets for language models
    Jeffrey Li*, Alex Fang*, Georgios Smyrnis*, and 8 more authors
    NeurIPS 2024
  3. Effective pruning of web-scale datasets based on complexity of concept clusters
    Amro Abbas*, Evgenia Rusak*, Kushal Tirumala, and 3 more authors
    ICLR 2024; Oral Presentation @ DataComp Workshop at ICCV 2023
  4. Progress and limitations of deep networks to recognize objects in unusual poses
    Amro Abbas and Stéphane Deny
    AAAI 2023
  5. Sieve: Multimodal dataset pruning using image captioning models
    Anas Mahmoud, Mostafa Elhoushi, Amro Abbas, and 4 more authors
    ICLR 2024
  6. A comparison between humans and AI at recognizing objects in unusual poses
    Netta Ollikka, Amro Abbas, Andrea Perin, and 2 more authors
    TMLR 2024