Matei Zaharia's profile photo

Matei Zaharia

Berkeley

Chief Technology Officer and Co-Founder at Databricks

CTO at @Databricks and CS prof at @UCBerkeley. Working on data+AI, including @ApacheSpark, @DeltaLakeOSS, @MLflow, https://t.co/94gROE5Xa0. https://t.co/nmRYAKG0LZ

Articles

  • 2 months ago | arxiv.org | Jan Luca | Matei Zaharia | Christopher Potts | Gustavo Alonso


  • Nov 13, 2024 | databricks.com | Naveen Rao | Matei Zaharia | Patrick Wendell | Eric Peter

    Monolithic to Modular: The proof of concept (POC) of any new technology often starts with large, monolithic units that are difficult to characterize. By definition, POCs are designed to show that a technology works without considering issues around extensibility, maintenance, and quality. However, once technologies mature and are deployed widely, these needs drive products to be broken down into smaller, more manageable units.

  • Oct 8, 2024 | databricks.com | Quinn Leng | Jacob P. Portes | Sam Havens | Matei Zaharia

    Retrieval Augmented Generation (RAG) is the top use case for Databricks customers who want to customize AI workflows on their own data. The pace of large language model releases is incredibly fast, and many of our customers are looking for up-to-date guidance on how to build the best RAG pipelines. In a previous blog post, we ran over 2,000 long-context RAG experiments on 13 popular open source and commercial LLMs to uncover their performance on various domain-specific datasets.

  • Oct 2, 2024 | databricks.com | Linqing Liu | Matthew Hayes | Ritendra Datta | Matei Zaharia

    Introduction: Applying Large Language Models (LLMs) for code generation is becoming increasingly prevalent, as it helps you code faster and smarter. A primary concern with LLM-generated code is its correctness. Most open-source coding benchmarks are designed to evaluate general coding skills. But in enterprise environments, LLMs must be capable not only of general programming but also of using domain-specific libraries and tools, such as MLflow and Spark SQL.

  • Aug 12, 2024 | databricks.com | Quinn Leng | Jacob P. Portes | Sam Havens | Matei Zaharia

    Retrieval Augmented Generation (RAG) is the most widely adopted generative AI use case among our customers. RAG enhances the accuracy of LLMs by retrieving information from external sources such as unstructured documents or structured data.

Contact details

Socials & Sites


X (formerly Twitter)

Followers: 42K
Tweets: 2K
DMs Open: No
Matei Zaharia @matei_zaharia
22 Apr 25

RT @databricks: There’s still time to join tomorrow’s webinar with Databricks CEO @AliGhodsi, @AnthropicAI CEO @DarioAmodei, and other indu…

Matei Zaharia @matei_zaharia
22 Apr 25

RT @lateinteraction: This was super cool work from Tomu Hirata and the rest of the DSPy and MLflow OSS teams at @databricks. Understanding…

Matei Zaharia @matei_zaharia
19 Apr 25

RT @tqchenml: Less than a month to go before #MLSys2025. We have a great line of keynote speakers @soumithchintala @istoica05 @AnimaAnandku…